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Abstract 


Calibration,  more  generally  referred  to  as  inverse  estimation,  is  an  important  and 
controversial  topic  in  statistics.  In  this  work,  both  semiparametric  calibration  and 
the  application  of  calibration  to  grouped  data  is  considered,  both  of  which  may  be 
addressed  through  the  use  of  the  linear  mixed-effects  model.  A  method  is  proposed  for 
obtaining  calibration  intervals  that  has  good  coverage  probability  when  the  calibration 
curve  has  been  estimated  semiparametrically  and  is  biased.  The  traditional  Bayesian 
approach  to  calibration  is  also  expanded  by  allowing  for  a  semiparametric  estimate  of 
the  calibration  curve.  The  usual  methods  for  linear  calibration  are  then  extended  to 
the  case  of  grouped  data,  that  is,  where  observations  can  be  categorized  into  a  finite 
set  of  homogeneous  clusters.  Observations  belonging  to  the  same  cluster  are  often 
similar  and  cannot  be  considered  as  independent;  hence,  we  must  account  for  within- 
subject  correlation  when  making  inference.  Estimation  techniques  begin  by  extending 
the  familiar  Wald-based  and  inversion  methods  using  the  linear  mixed-effects  model. 
Then,  a  simple  parametric  bootstrap  algorithm  is  proposed  that  can  be  used  to  either 
obtain  calibration  intervals  directly,  or  to  improve  the  inversion  interval  by  relaxing 
the  normality  constraint  on  the  approximate  predictive  pivot.  Many  of  these  methods 
have  been  incorporated  into  the  R  package,  investr,  which  has  been  developed  for 
analyzing  calibration  data. 
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TOPICS  IN  STATISTICAL  CALIBRATION 


I.  Introduction 

An  important  part  of  statistics  is  building  mathematical  models  to  help  describe 
the  relationship  between  variables.  Often,  we  have  a  dependent  variable  y,  called 
the  response,  and  an  independent  variable  A,  called  the  predictor.  The  researcher 
designs  an  experiment  and  procures  n  observed  pairs  (xi,yi), . . . ,  (xn,yn)  that  are 
then  used  to  fit  a  regression  model.  The  fitted  regression  model  is  often  used  to  make 
predictions.  In  particular, 

•  predict  an  individual  response  for  a  given  value  of  the  predictor; 

•  estimate  the  mean  response  for  a  given  value  of  the  predictor. 

Sometimes,  however,  the  researcher  is  interested  in  the  reverse  problem.  That  is, 
there  is  a  need  to  estimate  the  predictor  value  from  an  observed  value  of  the  response 
( calibration )  or  a  specified  value  of  the  mean  response  (regulation) .  Both  are  referred 
to  more  generally  as  inverse  estimation.  Calibration  has  also  been  referred  to  in 
the  literature  as  inverse  prediction ,  inverse  regression,  and  discrimination.  In  this 
paper,  we  discuss  calibration,  in  particular,  univariate  calibration.  For  an  overview 
on  topics  in  multivariate  calibration,  see  Brown  (1982),  Brown  and  Sundberg  (1987), 
and  Brown  (1993).  Point  estimation  is  reviewed  in  Section  3.2,  interval  estimation  in 
Section  3.3,  and  Bayesian  calibration  in  Section  3.4. 

A  calibration  experiment  typically  consists  of  two  stages.  In  the  first  stage,  n 
observations  ( Xi,yi )  are  collected  (henceforth  referred  to  as  the  standards )  and  used 
to  fit  a  regression  model  Ej^lx}  =  p.  The  fitted  model,  denoted  p,  is  often  referred 
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to  as  the  calibration  curve  or  standards  curve.  In  the  second  stage,  m  (m  >  1)  values 
of  the  response  are  observed  with  unknown  predictor  value  Xq  (henceforth  referred 
to  as  the  unknowns) .  The  goal  is  to  use  the  calibration  curve  p  to  estimate  xq.  The 
following  example  will  help  to  clarify  the  basic  idea. 

Suppose  a  new  procedure  has  been  developed  for  measuring  the  concentration  (in 
pg/ ml)  of  arsenic  in  water  samples  that  is  cheaper,  but  less  accurate  than  the  existing 
method.  An  investigator  has  procured  32  water  samples  containing  preselected 
amounts  of  arsenic  x%  and  subjected  them  to  the  new  method  producing  measured 
concentrations  p*.  The  standards,  taken  from  Graybill  and  Iyer  (1994),  are  plotted  in 
Figure  1.1.  A  new  water  sample  is  then  obtained  with  unknown  arsenic  concentration 
Xq  and  subjected  to  the  new  method,  producing  a  measured  concentration  of  3.0 
pg/ ml.  It  is  desired  to  estimate  the  true  concentration  of  arsenic  in  the  newly  obtained 
water  sample. 

The  goal  of  this  work  is  to  extend  the  methods  of  calibration  to  more  complicated 
settings.  In  Chapter  4,  we  introduce  semiparametric  calibration.  Here  we  use 
scmiparametric  regression  methods  to  estimate  the  calibration  curve  and  make 
inference  on  the  unknown  Xo-  The  benefit  of  this  approach  is  that  we  do  not  have 
to  specify  the  exact  form  of  the  calibration  curve.  The  downside  to  this  approach  is 
that  the  estimated  calibration  curve  will  be  biased,  hence,  our  inference  about  the 
unknown  Xq  will  also  be  biased.  We  correct  for  this  bias  by  taking  the  mixed  model 
approach  to  smoothing  described  in,  for  example,  Ruppert  and  Wand  (2003).  A 
small  Monte  Carlo  study  shows  that  this  correction  is  necessary  to  obtain  calibration 
intervals  with  coverage  probability  near  the  nominal  1  —  a  level.  We  also  extend 
the  method  of  Bayesian  linear  calibration  put  forth  by  Hoadley  (1970)  by  allowing 
for  the  calibration  curve  to  be  estimated  semiparametrically  as  in  Crainiceanu  et  al. 
(2005).  In  Chapter  5,  we  extend  the  classical  methods  of  calibration  to  work  with 
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Actual  ((ig/ml) 

Figure  1.1:  Scatterplot  of  the  arsenic  data. 

linear  mixed-effects  models.  We  also  propose  a  new  parametric  bootstrap  algorithm 
that  can  be  used  to  obtain  an  estimate  of  the  entire  sampling  distribution  of  the 
estimate  of  xq.  We  further  show  how  this  algorithm  can  also  be  used  to  improve  upon 
the  classical  methods  by  removing  normality  assumptions  that  may  not  be  satisfied 
in  practice.  We  use  several  real  datasets  to  demonstrate  and  compare  our  methods. 
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II.  Statistical  Background 


This  chapter  provides  an  overview  of  some  of  the  statistical  concepts  related  to 
inverse  estimation.  In  Section  2.1,  we  discuss  the  basic  notational  conventions  used 
throughout  this  dissertation.  Section  2.2  introduces  the  linear  regression  model.  The 
extension  to  nonlinear  regression  models  is  given  in  Section  2.3.  Section  2.4  is  devoted 
to  penalized  regression  splines,  a  type  of  semiparametric  regression  model  where  only 
part  of  the  model  is  specified.  Finally,  in  Section  2.5,  we  introduce  the  linear  mixed- 
effects  models,  an  extension  of  the  linear  model  (LM)  that  allows  for  some  of  the 
regression  coefficients  to  vary  randomly. 

2.1  Notation 

For  the  most  part,  random  variables  will  be  denoted  by  capital  letters  in  a 
calligraphic  font  (e.g.,  V)-  Vectors  will  be  denoted  by  bold,  lowercase  letters  and 
matrices  will  be  denoted  by  bold,  uppercase  letters.  When  convenient,  we  will  use 
the  following  notation  to  denote  row,  column,  and  diagonal  vectors/matrices: 


Xi\  =  (xi,  ...,xn) 


Xi\  =  diagjxi, . . .  ,xn}  . 


diag  J  i= 1 


If  X  is  a  random  variable,  then  X  ~  (/r,  a2)  simply  means  that  X  has  some 
distribution  with  mean  E{X}  =  /i  and  variance  VarjT’}  =  a2.  Estimators  and 
estimates  will  typically  be  denoted  by  the  same  Greek  letter  with  a  hat  symbol.  For 
example,  depending  on  the  context,  (3  may  represent  a  vector  of  estimators  or  their 
corresponding  estimates. 
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2.2  Linear  models 


The  linear  regression  model  has  been  a  mainstay  of  statistics  for  many  years.  It 
has  the  simple  form 

(2.1)  3d  =  X'f3  +  €i,  i  =  1, . . . ,  n, 

where  Xi  =  (xn is  a  p  x  1  vector  of  predictor  variables  for  the  i-t\i 
observation,  (3  =  (/?!,... ,  f3p)'  is  a  p  x  1  vector  of  fixed  (but  unknown)  regression 
coefficients,  and  e*  ~  (0,  of).  Thus,  an  alternative  formulation  is  to  specify  the  mean 
response 

E{yi\X}  =  X'(3  =  fil. 

Unless  stated  otherwise,  xt\  =  1  (i.e. ,  the  model  contains  an  intercept).  It  is  often 
convenient  to  work  with  the  matrix  form  of  (2.1),  which  is 

(2.2)  y  =  X(3  +  e ,  e  ~  (0,  of /„), 

where  y  =  (3d,  •  •  • ,  34 )7  is  an  n  x  1  vector  of  response  variables,  X  =  (Xi , . . . ,  Xp )' 
is  an  n  xp  matrix  of  predictor  variables  called  the  design  matrix ,  and  e  =  (ei, . . . ,  en)' 
is  an  n  X  1  vector  of  random  errors.  Equation  (2.1)  is  special  in  that  the  response  y 
is  a  linear  function  of  the  regression  parameters  (3. 

A  special  case  of  (2.1)  arises  when  p  =  2  and  the  distribution  for  the  errors  is 
normal.  That  is, 

(2.3)  3d  =  A)  +  PiXi  +  fj,  i  =  l,...,n, 

where  /3q  and  j3\  are  the  intercept  and  slope  of  the  regression  line,  respectively,  and 
€i  ~  A/"(0,  (7^).  This  is  called  the  simple  linear  regression  model  and  is  often  used  for 
analyzing  calibration  data. 

Another  special  case  of  the  linear  model  (2.1)  is  when  x%1  =  gj(xi),  j  —  1, . . .  ,p, 
where  each  gj(-)  is  a  continuous  function  such  as  \f-  or  log(-).  For  example,  a 
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polynomial  model  of  degree  p  has  the  form 


—  A)  +  /3\ Xi  +  fax'*  +  . . .  +  PpX?  +  6j,  i  —  1, ...  ,n. 

Notice  that  a  polynomial  model  is  linear  in  the  parameters  even  though  it  is  nonlinear 
in  the  predictor  variable;  hence,  it  is  a  linear  model. 

2.2.1  Estimating  the  model  parameters. 

Estimation  of  (3  in  the  linear  model  (2.1)  can  be  carried  out  via  least  squares  (LS). 
The  ordinary  LS  estimator  of  / 3  minimizes  the  residual  sum  of  squares, 

(2.4)  (3  =  argmin  || y  -  X(3\\ 2  =  (X’ X)~l  X’y. 

P 

If  we  make  the  additional  assumption  that  the  errors  are  normally  distributed,  then 
estimation  can  also  be  carried  out  by  the  method  of  maximum  likelihood  (ML).  This 
has  the  benefit  of  simultaneously  providing  an  estimator  for  both  (3  and  of.  To 
proceed,  we  need  to  maximize  the  likelihood 

C  (l3,a‘\y)  =  (2^)',/’Bp  j~T  ||y  _  x&f  | , 

or  equivalently,  maximize  the  log-likelihood 

&  (/3,  oil y)  =  log  (2tt)  -  |  log  (of)  -  ^  II y-  XP\\2  • 

The  derivatives  of  2£?(/3,  of|y)  with  respect  to  the  parameters  (/3,  of )  are 

d&  _  X’{y-X(3) 
d(3  of 

d &  _  \\y  —  X/3\\2  _  N 
dcr 1  2al  2 a2e  ’ 

which,  upon  setting  equal  to  zero  and  solving  yields  the  ML  estimators 


Fortunately,  for  the  linear  model  (2.1)  with  normal  errors,  the  ML  estimator  of  / 3  is 
the  same  as  the  LS  estimator.  It  is  customary  to  adjust  the  ML  estimator  of  of  for 


y-x[3 


/( n-p -  1). 


bias  by  replacing  it  with  of  = 

2.2.2  Predictions. 

A  common  use  of  the  fitted  regression  equation  is  to  make  predictions.  There 
are  two  types  of  predictions  we  distinguish: 


(1)  estimate  the  mean  response  when  X  =  Xo; 

(2)  predict  a  future  observation  corresponding  to  X0. 


Let  /i0  and  jho  be  the  mean  response  and  future  observation  of  interest,  respectively. 
We  will  see  that  the  point  estimators  of  /r0  and  jho  are  the  same,  namely  the  fitted 
value  jl(x).  Intuitively,  the  former  should  have  a  smaller  standard  error  since  there  is 
less  variability  in  estimating  a  fixed  population  parameter  than  in  predicting  a  future 
value  of  a  random  variable.  Consequently,  a  100(1  —  a)%  confidence  interval  for  /r0  at 
X0  will  always  be  smaller  than  a  100(1  —  a)%  prediction  interval  for  jho  corresponding 
to  Xio- 

Let  X0  be  an  arbitrary  value  of  X.  Suppose  we  are  interested  in  estimating  the 
mean  of  y  given  X0,  E  {32|J\T0}  =  p0.  For  the  linear  model  (2.1),  the  best  linear 
unbiased  estimator  (BLUE)  of  /io  is  the  fitted  value  yU0  =  X'aj3.  Furthermore,  it  is 
easy  to  show  that 

E  {ho}  =  X'0f3  =  jiQ 


and 


Var  {/20}  =  el  X'0  (X'X)-1  X 


=  S2 


Assuming  normal  errors,  it  follows  that 


ho  —  ho 


aJ[X'„(X'X)-'  X0] 


T{n  —  p). 
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Hence,  a  100(1  —  a)%  confidence  interval  for  the  mean  response  /x0  is  given  by 


(2.5) 


Let  33)  denote  an  individual  or  future  observation  at  the  given  point  X{).  We 
assume  that  33)  =  X'0 j3  +  eo,  where  eo  ~  A/"(0,  of)  and  is  independent  of  e.  The  best 
predictor  of  33)  is  Vo  =  X'0 /3,  the  same  as  /r0,  however,  the  variance  of  33)  is  wider: 
Var  1 33)}  =  of  +  Var{/io}.  This  is  intuitive  since  there  is  greater  uncertainty  in 
predicting  the  outcome  of  a  continuous  random  variable  than  in  estimating  a  fixed 
(population)  parameter,  such  as  its  mean.  Under  the  assumption  of  normal  errors, 
we  have  that 


7/n  —  "Vn 


therefore,  a  100(1  —  a)%  prediction  interval  for  33)  is  simply 


(2.6) 


We  call  (2.6)  a  prediction  interval,  as  opposed  to  a  confidence  interval,  since  yo  is  a 
prediction  for  the  outcome  of  the  random  variable  33) • 

2.3  Nonlinear  models 

Often  in  practice  there  is  an  underlying  theoretical  model  relating  the  response  to 
the  predictors,  and  this  model  may  be  nonlinear  in  the  parameters,  (3.  Such  nonlinear 
relationships  lead  us  to  the  nonlinear  regression  model.  For  a  single  predictor  variable, 
this  model  is 


(2.7) 


y,  =  n(xi\ p)  +  ei, 


where  /i(-)  is  a  known  expectation  function  that  is  nonlinear  in  at  least  one  of  the 
parameters  in  /3,  and  e*  ~  A/"(0,  of). 


2.3.1  Estimating  the  model  parameters. 

Borrowing  from  the  notation  in  Seber  and  Wilde  (2003),  let  Pi(/3)  =  g(xi,  (3), 


and 


M(/3)  =  (Ri  (P),---,3n(P))'  , 


D((3)  = 


dy(P) 

d(3' 


dthiP)  Y  Y 

aft  1  j 


For  convenience,  let  D  =  D{P). 

The  approach  to  estimating  P  in  the  nonlinear  model  (2.7)  is  similar  to  the 
approach  in  the  linear  model  (2.1).  That  is,  we  choose  the  value  of  [3  that  minimizes 
the  residual  sum  of  squares,  RSS(/3),  defined  by 


(2.8) 


RSS(/3)  =  ||V-m(/3)H2. 


However,  since  /i(-)  is  nonlinear  in  the  parameters  /3,  minimizing  Equation  (2.8) 
requires  iterative  techniques  such  as  methods  of  steepest  descent  or  the  Gauss-Newton 
algorithm,  which  we  describe  below. 

Given  a  starting  value  or  current  guess  P^  of  the  value  of  P  that  minimizes 

(2.8) ,  we  can  approximate  p(xp/3)  with  a  Taylor-series  approximation  around  P^°\ 
From  a  first-order  Taylor  series  expansion,  we  get 

(2.9)  M/3)  »  Mft”1)  +  D(P<®)'(P  -  ft0’), 


The  derivative  matrix  D(P )  plays  the  same  role  as  the  design  matrix  X  in  the  linear 
model  (2.1),  except  that  D(P)  may  depend  on  the  unknown  regression  parameters 
P-  Substituting  (2.9)  into  Equation  (2.8),  we  get 


(2.10) 

(2.11) 


RSS (p)  «  \\y  -  tx(pW)  -  D(p®)(p  -  /3(0)) ||2 
=  y{0)  -X(0)(/3-/3(0))  2, 
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where  y(0)  =  y  —  ^i(/3^),  and  =  Z)(/3^0^).  Equation  (2.10)  is  of  the  same  form 
as  the  sum  of  squares  for  the  linear  model.  Based  on  this  approximation,  the  LS 
estimate  of  (3  is  then 


P 


/3(0)  + 


-i 


x^'x^  x^'y^ 


Hence,  given  a  current  approximation  f3(k')  of  (3,  the  updated  approximation  is 


(3{k+ 1)  =  (3{k)  + 


-i 


x^y  c)  xWy^. 


This  process  is  iterated  until  a  suitable  convergence  criterion  is  met.  Just  as  for  the 
linear  model,  if  we  assume  the  errors  are  normally  distributed,  then  the  ML  estimate 
of  (3  is  the  same  as  the  LS  solution. 

2.3.2  Predictions. 

Let  xq  be  a  known  value  of  the  predictor  xq.  The  estimate  of  the  mean  response 
Q^o}  =  n(x o;/3)  is  /20  =  /i(x o’,P).  Furthermore,  assuming  normal  errors,  an 
approximate  100(1— a)%  conhdence  interval  for  the  mean  response  /i0  can  be  obtained 
as  in  Equation  (2.5)  but  with  S2  computed  as 

S2  =  o2d’0(yD'Dyld0, 


where 


d(\  — 


d/i  (g0;  (3)  \ 
dPi 


P=P 


An  approximate  100(1  —  a)%  prediction  interval  for  a  future  observation  is 
similarly  obtained.  These  intervals,  however,  rely  on  the  same  linear  approximation 


used  to  compute  (3.  For  large  samples,  these  intervals  are  often  reliable.  When 
the  sample  size  is  small  or  there  is  a  lot  of  curvature,  these  intervals  can  be  highly 
inaccurate.  The  bootstrap  (Efron,  1979)  provides  an  alternative  method  of  inference 
under  these  circumstances. 
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2.4  Smoothing 

Let  x  =  (xi, . . . ,  xn )'  be  a  vector  of  predictor  values  and  y  =  (3d,  •  •  • ,  34) 7  be  a 
vector  of  response  variables.  We  assume  that  X  and  y  are  related  by  the  regression 
model 

y  =  fi(x)  +  e,  e~W(0  ,of/), 

where  //(•)  is  an  unknown  smooth  function.  In  this  section,  we  discuss  a  technique 
for  estimating  //(•)  nonparametrically,  often  referred  to  as  scatterplot  smoothing,  or 
just  smoothing.  In  particular,  we  will  focus  on  an  important  class  of  smoothers  called 
linear  smoothers.  For  linear  smoothers,  the  prediction  at  any  point  x,  is  a  linear 
combination  of  the  response  values:  jl(x)  =  u /y,  where  is  a  constant  that  does  not 
depend  on  the  response  vector  y.  The  vector  of  fitted  values  jl  =  (n(x i), . . . ,  p(xn))' 
can  be  written  in  matrix  form  as  sy,  where  S  is  an  n  x  n  smoother  matrix.  Examples 
of  linear  smoothers  include: 

•  running-mean,  running-line,  and  running-polynomial  smoothers; 

•  locally- weighted  polynomial  smoothers  (i.e. ,  LOESS); 

•  spline-based  smoothers; 

•  kernel  smoothers. 

The  remainder  of  this  section  discusses  spline  smoothing,  specifically,  the  penalized 
regression  spline  (P-spline). 

Regression  splines  represent  the  mean  response  p(x)  as  a  piecewise  polynomial 
of  degree  p.  The  regions  that  define  each  piece  are  separated  by  special  breakpoints 
£1  <  . . .  <  £k  called  knots.  By  increasing  p  or  the  number  of  knots,  the  family  of 
curves  becomes  more  flexible;  thus,  as  shown  in  Figure  2.1,  we  can  easily  handle  any 
level  of  complexity. 
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X 

Figure  2.1:  Unnormalized  sine  function  example.  Scatterplot  of  100  observations 
generated  from  a  regression  model  with  mean  response  p(x)  =  sin(x)/x  (green  line) 
and  i.i.d.  A/"(0,  0.052)  errors.  The  solid  line  shows  a  quadratic  P-spline  fit  with 
smoothing  parameter  A  =  0.9986.  The  dotted  lines  indicate  the  position  of  the  knots 

6c- 


A  p-th  degree  spline  function  has  the  form 

K 

(2.12)  p{x)  =  j30  +  fax  +  . . .  +  f3pxp  +  ^  ak(x  ~  60+, 

k= 1 

where  the  notation  a+  denotes  the  positive  part  of  a,  that  is,  a+  =  a  ■  I  (a  >  0).  Many 
methods  for  choosing  the  number  and  location  of  the  knots  are  given  in  the  literature 
(see,  for  example,  Ruppert  (2002)  and  the  references  therein).  Let  nx  be  the  number 
of  unique  x*.  A  reasonable  choice  for  the  number  of  knots  (Ruppert  and  Wand,  2003, 
pg.  126)  is  K  =  min(nx/4,  35)  with  knot  locations 


6c  = 


k  +  1 
A' +  2 


-th  sample  quantile  of  the  unique  Xj,  k  —  1, . . . ,  K. 


The  idea  is  to  choose  enough  knots  to  capture  the  structure  of  fi(x).  If  both  p  and 
K  are  too  large,  we  run  the  risk  of  overfitting  (i.e. ,  low  bias  and  high  variance).  If  K 
is  too  small  then  the  resulting  fit  may  be  too  restrictive  (i.e.,  low  variance  but  high 
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bias).  There  is  rarely  the  need  to  go  beyond  a  cubic  polynomial  model  and  so  typical 
choices  for  p  are  1,  2,  or  3. 

In  matrix  form,  the  polynomial  spline  model  is 


(2.13) 

where 


y  =  X(3  +  e ,  e  ~  (0,  of  J), 

1  X\  ■■■  x\  (xi  -  6)+  •  •  •  (Xl  -  6r)+ 


X  = 


1  xn  ■■■  xp  (xn-  6)+  ...  (xn-  60+ 

Equation  (2.13)  has  the  same  form  as  the  linear  model  (2.2).  The  ordinary  least 
squares  fit,  however,  will  be  too  “wiggly.”  The  idea  behind  penalized  spline  regression 
is  to  shrink  the  coefficients  oy,  by  imposing  a  penalty  on  their  size,  thereby  limiting 
their  impact  on  the  estimated  response  curve.  The  estimated  coefficients  minimize 
the  penalized  residual  sum  of  squares 


(2.14) 


PSS  =  II y  -  Xf3 II2  +  \2p(3'D(3 , 


where  D  =  diag  {0(p+i)x(p+i),  Ikxk}-  The  penalized  least  squares  solution  is  then 

(2.15)  3a  =  argmin  \\y  -  X(3\\2  +  \2pf3'D{3  =  {X'X  +  \2pD)~l  X'y. 

P 

Here  A  >  0  is  a  smoothing  parameter  that  controls  the  wiggliness  of  the  fit.  Small 
values  of  A  produce  wiggly  curves  while  larger  values  produce  smoother  curves.  The 
term  A 2p(3' Df3  is  called  the  roughness  penalty.  If  D  =  I,  then  the  penalized  least 
squares  solution  (2.15)  is  equivalent  to  the  ridge  regression  estimate  of  (3.  The 
roughness  penalty  can  also  be  written  as  X2p  || ck || 2,  where  «  =  (aq, . . . ,  oik)'  is 
the  vector  of  coefficients  for  the  spline  basis  functions.  Hence,  P-splines  enforce  a 
penalty  on  the  £2-norm  of  ck,  so  none  of  the  polynomial  coefficients  are  penalized  (See 
Figure  2.2)! 
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Figure  2.2:  Profiles  of  spline  coefficients  as  the  smoothing  parameter  A  is  varied. 
Coefficients  are  plotted  versus  A.  A  vertical  line  is  drawn  at  A  =  0.9986. 


The  fitted  values  are  given  by  ft  =  S\y,  where  S\  =  X  (. X'X  +  A 2pD)  1  X' . 
From  this  point  forward,  we  will  drop  the  subscript  A  on  the  smoother  matrix  and 
just  write  S.  The  smoothing  parameter  A  is  unknown  but  can  be  specified  a  priori. 
However,  it  is  often  beneficial  to  let  the  data  determine  the  appropriate  amount  of 
smoothness.  To  this  end,  cross-validation  techniques  are  often  used  to  estimate  A 
from  the  given  data.  In  Chapter  5,  we  discuss  an  alternative  approach  to  P-splines 
that  automatically  selects  an  appropriate  amount  of  smoothness. 

2.4.1  Inference  for  linear  smoothers. 

Consider  the  general  heteroscedastic  error  model 

y>i  =  )  +  ei:  6i  ~  (0,  a2),  i  =  1, . . .  n, 

where  //(•)  is  an  unknown  smooth  function.  Let  fi(x)  be  an  estimate  of  //(•)  based  on 
a  linear  smoother.  The  covariance  matrix  of  the  vector  of  fitted  values  fi  =  Sy  is 

(2.16)  Var  {£}  =  S  (a2l)  S'  =  a2eSS'. 
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For  a  single  point,  the  quantity  Q  =  [jj,(x)  —  p,(x)]  / se  {fi(x)}  is  approximately 
pivotal — the  distribution  is  free  of  unknown  parameters,  at  least  for  sufficiently 
large  sample  size  n.  Therefore,  assuming  e  ~  Equation  (2.16)  can  be 

used  to  form  conEdence  intervals  and  prediction  intervals.  Note,  however,  that 
E  {/2}  =  X(X'X  +  X  2PD)~1X’Xy-  hence,  j2(x)  is  a  biased  estimator  of  /i(x). 
Unless  the  bias  is  negligible,  the  confidence  intervals  discussed  here  only  cover 
E{J2(x)}  with  100(1  —  a)%  confidence.  This  problem  is  remedied  for  P-splines  in 
Chapter  4  where  we  discuss  an  alternative  approach  using  mixed  model  methodology. 

Let  x  be  an  arbitrary  value  of  the  explanatory  variable.  Recall  that,  for  linear 
smoothers,  the  fitted  value  Jl(x)  can  be  written  as  iv'y,  a  linear  combination  of  the 
response  variables.  The  variance  of  J2(x)  is  just  Varjo/y}  =  cGa/ax  Given  an 
estimate  of  of  a/,  an  approximate  100(1  —  a)%  confidence  interval  for  /a(x)  is  given 
by 

±  h-Q/2,d/CV/w'w, 

where  df  =  n  —  2tr  ( S )  +  tr  ( SS ').  For  sufficiently  large  sample  size  n,  the  quantity 
ti-a/2,df  can  be  replaced  with  Zi_a/2,  the  1  —  a/2  quantile  of  a  standard  normal 
distribution.  Similarly,  A  100(1  —  a)%  prediction  interval  for  a  new  observation  is 

fi(x)  ±  tl-a/2,d/<U\/l  +  u'u. 

2.5  Linear  mixed  effects  models 

Mixed  effects  models  (henceforth  referred  to  as  just  mixed  models)  represent 
a  large  and  growing  area  of  statistics.  In  this  section,  we  only  summarize  the 
key  aspects  of  a  special  kind  of  mixed  model  called  the  linear  mixed-effects  model 
(LMM).  The  extension  to  nonlinear  mixed  effects  models  and  generalized  linear 
mixed  effects  models  is  discussed  in  Pinheiro  and  Bates  (2009)  and  McCulloch  et  al. 
(2008),  respectively.  Mixed  models  are  useful  for  describing  grouped  data  where 
observations  belonging  to  the  same  group  are  correlated.  One  way  of  accounting 
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for  such  correlation  is  by  introducing  random  effects  into  the  model  which  induces 
a  particular  correlation  structure  on  the  response  vector.  Different  random  effects 
structures  induce  different  correlation  structures  on  the  response. 

The  LMM  extends  the  basic  LM  (Equation  (2.2))  to 

(2.17)  y  =  Xp  +  Zot  +  e, 

where  X  and  Z  are  known  design  matrices,  (3  is  a  vector  of  fixed  effects,  a  is  a 
vector  of  random  effects  distributed  as  «  ~  J\f(0,G),  and  e  is  a  vector  of  random 
errors  distributed  as  e  Af(0,R).  Further,  it  is  assumed  that  the  random  effects 
and  errors  are  mutually  independent,  that  is,  ot  _LL  e. 

Estimating  the  fixed  effects  (3  is  rather  straightforward  and  does  not  require 
normality.  Note  that  y  =  X(3  +  £ ,  where  E{£}  =  0  and  Var  {8}  =  ZGZ'  +  R  =  V. 
Assuming  normality,  the  log-likelihood  (ignoring  constants)  is  given  by 

(2.18)  jf{p, e)  =  -f  log \v\  - 1  (y  -  X0Y V-1  (y  -  x/3) , 

where  6  is  a  vector  containing  the  unique  elements  of  V.  Equating  the  partial 
derivative  of  J^(/3,0),  with  respect  to  the  parameter  /3,  yields  the  so-called 
generalized  least  squares  (GLS)  estimator 

(2.19)  p  =  (x’v^xy1  x’v~ly. 

Notice,  however,  that  the  GLS  estimator — which  happens  to  be  the  BLUE  of  (3 — 
depends  on  the  variance-covariance  matrix  V .  Since  this  is  rarely  available  in  practice, 
the  usual  procedure  is  to  estimate  V  and  then  plug  this  into  Equation  (2.19).  In  other 
words,  if  V  is  an  estimate  of  V,  then  the  estimated  (or  empirical)  best  linear  unbiased 
estimator  (EBLUE)  of  (3  is 

(2.20)  p  =  (yx’v^x^j  1  x’v^y. 
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This  causes  some  difficulties,  mostly  in  terms  of  finite-sample  inference,  since  there 
is  no  simple  way  to  account  for  the  variability  of  V  when  calculating  Var  j/3  j.  See, 
for  example,  McCulloch  et  al.  (2008,  pp.  165-167). 

A  technique  known  as  best  linear  unbiased  prediction  is  commonly  used  to 
estimate  the  random  effects  ot.  ft  can  be  shown  (Henderson,  1973)  that  the  BLUE  of 
(3  and  the  best  linear  unbiased  predictor  (BLUP)  of  ot,  denoted  ot,  can  be  determined 
simultaneously  as  the  solutions  to  a  penalized  least  squares  problem, 


(2.21) 


0 


ot 


=  argmin  {y  -X(3-  Zot)'R~l(y  -Xf3-  Zot)  +  oiG xot 


13, a 


=  o(o'  RG  +  D)n'R -ly, 


where  f3  is  as  in  (2.19),  at  =  GZ’V^1  [y  —  X (3^ ,  ft  =  (X;Z),  and  D  = 
diag{0pxp,  G -1}.  For  the  special  case  R  =  a^I  and  G  =  cr^I,  the  penalized  sum  of 
squares  (PSS) — the  minimand  in  Equation  (2.21) —  reduces  to 


(2.22) 


±-\\y-X(3-ZoL\\2  +  \\\ot\\2 


(Ji 


cri 


Similar  to  the  EBLUE  of  (3,  the  estimated  (or  empirical)  best  linear  unbiased 
predictor  (EBLUP)  of  at  is  just  the  BLUP  at  with  G  and  V  replaced  with  their 
respective  estimates,  G  and  V : 

(2.23)  a  =  GZ'V~l  (y  -  X^j  . 

In  a  similar  fashion,  the  BLUP  and  EBLUP  of  the  mean  response  are  given, 
respectively,  by  the  equations 


(2.24) 

(2.25) 


/i  =  Xj3  +  Zot. 
jl  =  X(3  +  Zot. 


Note  that  the  EBLUP  /x  is  just  the  fitted  values. 
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Obviously,  estimating  (3  and  ct,  requires  knowledge  of  the  variance-covariance 
matrices  G  and  R ,  which  are  generally  unknown.  In  practice,  we  often  restrict  these 
matrices  to  have  a  simple  form,  usually  involving  only  a  few  unknown  parameters, 
earlier  denoted  by  0.  These  parameters  can  be  estimated  via  ML  estimation.  ML 
estimators  of  the  variance  components  9,  however,  tend  to  become  badly  biased  as  the 
number  of  fixed  effects  in  the  model  increases.  A  more  effective  approach,  known  as 
restricted  (or  residual)  maximum  likelihood  (REML)  estimation,  is  often  used  instead 
(see,  for  example,  McCulloch  et  al.  (2008,  chap.  6)). 
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III.  Overview  of  Statistical  Calibration 


In  this  chapter,  we  present  a  thorough  overview  of  calibration  as  it  pertains  to  our 
goals  for  this  dissertation.  Although  we  cannot  cover  every  aspect  of  calibration  in  this 
chapter,  we  have  attempted  to  provide  an  extensive  bibliography  for  the  interested 
reader.  The  main  themes  of  this  chapter  are  going  to  be  point  estimation  and  interval 
estimation. 

3.1  Controlled  Calibration  vs.  Natural  Calibration 

Calibration  experiments  can  be  classified  as  one  of  two  types:  controlled 
calibration  experiments  and  natural  calibration  experiments.  In  controlled  calibration, 
the  predictor  values  are  held  fixed  by  the  experimenter  (i.e. ,  x  is  not  considered 
a  random  variable).  In  this  case,  the  predictor  values  are  chosen  to  cover  the 
experimental  range  of  interest  and  the  response  is  often  replicated  a  number  of  times 
at  each  design  point  (see,  for  example,  the  arsenic  data  plotted  in  Figure  1.1).  In 
contrast,  the  n  observations  (a;*,  y,-)  in  a  natural  calibration  experiment  are  considered 
a  random  sample  from  some  bivariate  distribution:  (X,y)  ~  y(x,y),  where  g  is 
typically  assumed  to  be  a  bivariate  normal  distribution.  In  summary,  the  values  of  the 
independent  variable  are  either  preselected  or  held  fixed  in  controlled  calibration  and 
randomly  sampled  from  a  population  of  values  in  natural  calibration.  Distinguishing 
between  these  two  types  of  calibration  experiments  is  important,  especially  from  an 
inferential  standpoint,  as  emphasized  in  the  papers  by  Brown  (1982)  and  Brown 
(1993). 

3.2  Point  estimation 

Here,  we  discuss  point  estimation  of  xo  for  the  linear  calibration  problem,  that 
is,  when  the  calibration  curve  has  the  form  of  the  simple  linear  regression  model. 
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The  two  most  popular  estimators  of  x0  are  the  classical  estimator  and  the  inverse 
estimator.  The  classical  estimator,  which  dates  back  to  Eisenhart  (1939),  is  based 
on  inverting  the  calibration  curve  at  x/q  (i.e. ,  solving  the  fitted  regression  equation 
for  xo)  and  is  easily  extended  to  polynomial  and  nonlinear  calibration  problems.  The 
inverse  estimator,  as  we  will  see  in  Section  3.4,  is  more  useful  under  a  specific  Bayesian 
framework. 

3.2.1  The  classical  estimator. 

Suppose  we  have  n  observations  (xj,3d)-  We  assume  the  x^s  were  measured 
without  error.  (The  case  where  there  is  error  in  both  variables  is  discussed  in  Carroll 
and  Spiegelman  (1986).)  Generally,  the  model  considered  is  of  the  form 

(3.1)  3d  =  p  (xi,  (3)  +  6i,  i  —  1, . . .  ,n, 

where  p  =  E  {3^|x}  is  a  known  expectation  function,  (3  is  a  vector  of  p  unknown 
regression  parameters,  and  the  errors  are  independent  and  identically  distributed 
(i.i.d.)  normal  random  variables:  e*  ~  We  consider  the  linear  calibration 

problem,  a  special  case  of  Equation  (3.1)  with  E  {3^}  =  /30  +  (3\X. 

The  fitted  calibration  line  is  given  as 

(3.2)  p  =  fto  +  /3ix, 

where  /3q  and  /3i  are  the  ML  estimates  of  /30  and  j3i,  respectively.  If  we  observe 
33)  =  Vo,  where  33)  rs-/  W(/30  +  /3iXo,a^),  then  the  obvious  estimate  of  x0  is  obtained 
by  inverting  the  calibration  line: 

(3-3)  £0  =  h~\yo]  (3)  =  Vo  a  =  x  +  ^(2/0  -  y), 

Pi 

where  Sxy  =  ~  x)(Vi  ~  V)i  and  Sxx  =  XY-  Under  the  assumption  of  i.i.d. 

normal  errors,  Equation  (3.3)  is  the  ML  estimate  of  xq.  More  generally,  suppose  in 
addition  to  the  standards 

(ah,  3d),  (x2, 3^2),...,  (xn,  yn), 
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we  have  m  unknowns 


y 


(x0,  3^n+l);  (x0,  (Vn+2))  •  ■  ■  j  (X()j  3A+m)- 

We  assume  that  the  +’s  (i  —  1, . . .  ,n)  are  independent  and  distributed  according  to 

N{0o  T+x^of),  i  — 

N(0o  +  /3ix0,  erf),  i  =  n  +  1,  n  +  2, . . . ,  n  +  m 
In  practice,  assuming  that  of  =  erf  is  often  reasonable;  however,  some  authors  (e.g., 
Berkson  (1969,  p.  659)),  have  argued  otherwise.  In  controlled  calibration,  one  could 
argue  that  the  variance  in  the  first  stage,  of,  may  be  smaller  than  the  variance  from 
the  second  stage,  erf,  since  the  standards  were  likely  collected  under  more  highly 
controlled  conditions.  For  example,  the  standards  in  the  arsenic  data  may  have  been 
collected  in  a  laboratory  under  tightly  controlled  conditions,  while  the  measurement 
made  on  the  new  sample  was  likely  made  in  the  field  (e.g.,  a  lake)  and  therefore 
susceptible  to  greater  measurement  error. 

The  log-likelihood  for  all  n  +  m  observations  is 

fl  _L  TT1 

&  (A),  A,  of ,  X0)  =  -  log(27Tof  )  — 


1 

2<7? 


n+ra 


J^(+  ~0o~  0\Xi)2  +  Yt  (y<  -  0o  —  0iXOf 


i=  1 


i=n+ 1 


Equating  the  partial  derivatives  of  (0o,  0i,  of ,  xq)  to  zero  and  solving  for  the 
parameters  produces  the  usual  ML  estimators  of  the  slope  and  intercept,  but  also 
yields 

y0-p0 


(3.4) 

(3.5) 


x0  = 


0i 


+  = 


n  +  m  —  3 


2=1 


n+m 


yf(yi-0o-0ixiy+  y  (w-^0)2 


2=72+1 


where  y®  =  m  1  Y^i=^+  i  +  •  Equation  (3.4)  is  known  as  the  classical  estimator  of  xq. 


—1 v^n+m 
Z_-/2=n+( 

The  mean  of  xq  does  not  exist,  and  it  has  infinite  variance  and  mean  squared  error 
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(MSE).  This  is  not  too  surprising  since  Equation  (3.4)  is  a  ratio  of  jointly  normal 
random  variables  (recall  that  a  standard  Cauchy  distribution,  which  does  not  have 
any  finite  moments,  results  from  the  ratio  of  standard  normal  random  variables). 
Also,  note  that  the  pooled  estimate  of,  Equation  (3.5),  is  a  weighted  average  of  the 
estimates  of  of  from  the  first  and  second  stages  of  the  calibration  experiment. 

The  sampling  distribution  of  xG  is  quite  complicated.  Fortunately,  its  derivation 
is  not  necessary  for  setting  a  100(1  —  a)%  confidence  interval  on  xq  (see  Section  3.3). 
Nonetheless,  the  resulting  distribution  has  been  studied  by  Fieller  (1932),  Hinkley 
(1969),  Buonaccorsi  (1986),  and  Pham-Gia  et  al.  (2006),  among  others.  The  paper 
by  Pham-Gia  et  al.  (2006)  gives  a  closed-form  expression  for  the  density  of  the  ratio 
of  jointly  normal  random  variables.  Buonaccorsi  (1986)  showed  that  a  sufficient 
condition  for  unimodality  of  the  sampling  distribution  of  Xq  is 

(I0  _„2<  5.094  (|-) 2  (i  +  l). 

A  similar  result  also  holds  for  the  posterior  of  Xo  in  the  Hunter  and  Lamboy  (1981) 
approach  to  Bayesian  linear  calibration  (see  Section  3.4).  Unimodality  here  is 
important  since  some  confidence  intervals  (e.g.  the  Wald  interval  of  Section  3.3.2)  are 
derived  under  the  assumption  that  the  sampling  distribution  of  Xq  is  asymptotically 
normal. 

3.2.2  The  inverse  estimator. 

The  classical  estimator  involves  regressing  y  on  x  and  solving  the  fitted  regression 
equation  for  the  unknown  xq.  The  inverse  estimator,  however,  uses  the  regression  of 
x  on  y  to  obtain 

S 

(3.6)  xG  =  70  +  7W0  =  x  +  -^-(yo  -y), 

*-5 yy 
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where  70  and  71  are  least  squares  (LS)  estimates,  and  Syy  =  YKl/i  ~  v)2-  After  a  bit 
of  algebra,  Equation  (3.6)  can  be  re-written  as 

(3.7)  x0  =  (l-  \  x  +  xy  x0  =  (l  -  R2)  x  +  R2x0 , 

\  ^xx^yy  J  ^xx^yy 

where  R 2  is  the  coefficient  of  determination  computed  from  (3.2).  This  shows  x0  as 
a  weighted  average  of  the  ML  estimator  xo  and  x;  hence,  the  inverse  estimator  takes 
into  account  previous  information  about  Xq  (this  is  relevant  to  Section  3.4  where  the 
inverse  estimator  is  shown  to  be  Bayes  with  respect  to  a  certain  prior  on  xo).  Also, 
when  the  error  variance  is  zero,  R2  =  1  and  Xq  =  Xo- 

Inference  based  on  this  approach,  at  least  from  the  frequentist  perspective,  is 
justifiable  only  if  the  observations  (xi,j/i)  are  sampled  from  a  bivariate  distribution 
(i.e. ,  a  natural  calibration  experiment).  In  controlled  calibration  experiments,  x  is 
held  fixed  and  the  classical  estimator  is  mostly  preferred.  Nonetheless,  much  effort 
has  gone  into  justifying  the  use  of  the  inverse  estimator  in  controlled  calibration  (see, 
for  example,  Krutchkoff  (1967)  and  more  recently  Kannan  et  al.  (2007)). 

3.2.3  Criticisms  and  other  estimators. 

Using  extensive  Monte  Carlo  simulations,  Krutchkoff  (1967)  argued  that  the 
inverse  estimator  (3.6)  had  a  uniformly  smaller  Mean  squared  error  (MSE)  than  the 
classical  estimator  (3.3).  His  experiments  considered  both  normal  and  non-normal 
error  distributions.  This  sparked  controversy  in  the  statistical  community  resulting  in 
a  renewed  interest  in  the  inverse  estimator  for  controlled  calibration.  In  a  later  paper 
(Krutchkoff,  1969),  Krutchkoff  concluded  that  the  classical  approach  is  superior  for 
large  sample  sizes  when  extrapolating  beyond  the  range  of  observations. 

When  the  error  distribution  is  normal  and  n  >  4,  the  inverse  estimator  has 
finite  MSE  (Oman,  1985);  whereas  the  classical  estimator  has  infinite  MSE  for 
finite  n  (Williams,  1969).  Furthermore,  Williams  (1969)  argued  that  MSE  is  an 
inappropriate  criterion  for  comparing  the  two  estimators  by  establishing  that  “...no 
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unbiased  estimator  [of  Xq]  will  have  finite  variance.”  Halperin  (1970)  agreed  with 
Williams  and  suggested  the  use  of  the  Pitman  closeness  criterion  (Pitman  Hobart 
1937;  Mood  et  al.  1974,  pg.  290)  instead.  For  a  parameter  0  with  parameter  space 
$,  an  estimator  Ti  is  said  to  be  Pitman  closer  (to  </>)  than  another  estimator  T2  if  for 
all  </>  G  <E>, 

(3.8)  Pr0(|Ti  -0|  <  \T2  -0|)  >  0.5. 

Unlike  the  MSE  criterion,  Pitman  closeness  takes  into  consideration  the  correlation 
between  the  two  estimators  being  compared  (here,  they  are  perfectly  correlated). 
With  respect  to  the  Pitman  closeness  criterion,  Halperin  claimed  that  the  classical 
estimator  is  superior  both  outside  and  within  the  range  of  predictor  values;  though, 
Halperin’s  conclusions  were  based  on  asymptotic  approximations.  In  addition, 
Halperin  also  compared  the  two  estimators  in  terms  of  consistency  and  MSE  of  the 
relevant  asymptotic  distributions,  showing  that  Krutchkoff’s  findings  were  only  true 
for  a  closed  interval  around  x.  The  width  of  this  interval  depends  on  the  product  of 
the  standardized  slope  and  standard  deviation  of  the  predictor  values,  ax.  Halperin 
also  preferred  the  classical  estimator  on  the  basis  that  it  yields  an  exact  100(1  —  a)% 
confidence  region  for  x0  (see  Section  3.3.1).  In  response,  Krutchkoff  (1972)  used 
the  Pitman  closeness  criterion  in  Monte  Carlo  simulations  and  obtained  the  opposite 
results;  that  is,  the  inverse  estimator  was  overall  superior  to  the  classical  estimator. 
The  contradiction  seems  to  stem  from  the  range  of  predictor  values  considered  by 
both  authors. 

The  classical  and  inverse  estimators  cannot  be  compared  through  exact  moments; 
however,  one  can  easily  examine  their  asymptotic  properties.  Berkson  (1969) 
established  that  the  classical  estimator  is  asymptotically  unbiased  while  the  inverse 
estimator  is  not.  He  derives  formulas  for  the  asymptotic  bias,  variance,  and  MSE  of 
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both  estimators.  Shukla  (1972)  extended  these  formulas  to  an  accuracy  of  0{n  1): 


I 

fj'J  m  n 


^  SXx  triSxxP] 


n  Sxx  mSxxl3l6 2 

2cr4(a:o  —  x )2 


1  (x0  -  x )2  a2e{62  -  29  +  6) 


2a2  (x  -  x0) 
nfilalQ3 


where  a2  =  Sxx/{n  —  1)  and  0  =  1  +  <72/(/3i<jx)2.  Lwin  (1981)  provided  a  further 
extension  by  allowing  the  error  distribution  to  be  any  member  of  the  location-scale 
family  of  distributions.  To  an  accuracy  of  0(n-1),  Lwin  showed  that  the  asymptotic 
MSE  of  the  classical  estimator  for  any  error  distribution  from  the  location-scale  family 
is  the  same  as  that  obtained  by  Shukla  (1972).  The  same  is  not  true  for  the  inverse 
estimator  whose  MSE  to  the  same  order  is  affected  by  both  the  skewness  and  kurtosis 
of  the  error  distribution. 

Kannan  et  al.  (2007)  obtained  more  accurate  results  supporting  the  use  of  the 
inverse  estimator,  ffowever,  they  found  that  when  |/3i/cr|  is  moderate  to  large,  the 
classical  method  is  preferred  in  the  sense  of  Pitman  closeness.  Ali  and  Ashkar  (2002) 
discussed  the  impact  of  the  coefficient  of  determination  and  proposed  an  estimator 
based  on  the  midpoint  of  the  inversion  interval  for  x0  (see  Section  3.3.1).  Lwin  and 
Maritz  (1982)  take  a  compound  estimation  approach  to  the  linear  calibration  problem. 
They  discuss  the  merits  of  both  estimators  and  provide  further  justification  for  each 
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without  reference  to  specific  distributional  assumptions.  Lwin  and  Maritz  concluded 
that  the  classical  estimator  is  preferred  only  if  the  condition  of  asymptotic  unbiased  is 
imposed.  A  likelihood  analysis  for  the  calibration  problem  was  carried  out  by  Minder 
and  Whitney  (1975). 

A  number  of  other  estimators  have  been  proposed  in  the  literature;  though, 
none  have  received  the  level  of  attention  as  the  classical  and  inverse  estimators.  Ah 
and  Singh  (1981),  for  example,  used  a  weighted  average  of  the  classical  estimator 
Xo  and  x.  The  idea  is  to  shrink  the  estimate  toward  x  when  x$  is  near  the  center 
of  the  data.  Using  small-disturbance  asymptotic  approximations,  Srivastava  and 
Singh  (1989)  derived  an  estimator  that  is  a  weighted  average  of  the  classical  and 
inverse  estimators:  £  =  [x0  +  (n  —  3)x0]  /{n  ~  2).  This  estimator  gives  more  weight 
to  the  inverse  estimator  Xq  when  the  sample  size  is  large  and  vice  versa.  In  terms  of 
asymptotic  bias,  £  is  superior  to  the  classical  and  inverse  estimators.  Naszdi  (1978) 
proposed  an  estimator  that  is  approximately  (asymptotically)  unbiased,  consistent, 
and  more  efficient  than  the  classical  estimator.  Dahiya  and  McKeon  (1991)  obtain  a 
confidence  interval  for  xo  based  on  the  estimator  in  Naszdi  (1978). 

Many  papers  comparing  the  classical  and  inverse  estimators  have  been  published, 
none  of  which  offer  a  definitive  answer  on  which  estimator  is  best.  If  point  estimation 
is  the  only  concern  (which  is  rarely  the  case),  both  estimators  are  of  value.  On 
the  other  hand,  if  inference  is  to  be  made  regarding  xq,  the  classical  estimator  is 
preferred  for  controlled  calibration  experiments  and  the  inverse  estimator  for  natural 
calibration  experiments.  Of  course,  if  suitable  prior  information  can  be  assembled, 
then  a  Bayesian  approach  may  be  the  best  alternative  in  either  case,  especially  if  the 
calibration  curve  is  nonlinear  (see  Section  3.4). 

Finally,  note  that  most  of  the  previous  discussion  was  in  reference  to  the  linear 
calibration  problem  with  homoscedastic  normal  errors.  The  inverse  estimator  is  less 
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appropriate  for  nonlinear  calibration  curves,  especially  when  the  calibration  curve  has 
horizontal  asymptotes  (Jones  and  Rocke,  1999) 

3.2.4  Arsenic  example. 

Here  we  illustrate  the  use  of  the  classical  and  inverse  estimators  on  the  arsenic 
data,  for  which  the  simple  linear  regression  model  with  homoscedastic  normal 
errors  seems  appropriate.  This  is  a  controlled  calibration  experiment  since  the  true 
concentrations  of  arsenic  were  preselected  by  the  experimenter.  We  wish  to  estimate 
the  true  concentration  of  arsenic  in  a  new  sample  based  on  the  new  observation  y 0  =  3 
/ig/ ml.  Figure  3.1  shows  a  scatterplot  of  the  standards  with  the  fitted  calibration  line. 
The  ML  estimate  of  Xo  is  2.941  /ig/ ml  while  the  inverse  estimate  is  2.945  /ig/ml,  a 
difference  of  only  0.004.  In  practice,  the  two  estimators  will  not  be  much  different 
when  the  coefficient  of  determination,  R2,  is  close  to  one.  For  the  arsenic  data, 
R2  =  0.9936  (recall  Equation  3.6). 
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Actual  (|ig/ml) 

Figure  3.1:  Fitted  calibration  line  for  the  arsenic  data.  The  horizontal  arrow  indicates 
the  position  of  the  observed  response  y0  =  3  jig/ ml  and  the  vertical  arrow  indicates 
the  position  of  the  ML  estimate  xq  =  2.941. 


3.3  Confidence  intervals 

Much  effort  has  been  put  into  deriving  and  comparing  point  estimators  for 
xq.  Without  some  measure  of  precision,  however,  a  point  estimate  is  practically 
useless.  In  this  section,  we  discuss  construction  of  100(1  —  a)%  confidence  intervals 
for  xq,  also  known  as  calibration  intervals.  There  are  two  methods  commonly  used 
for  calculating  calibration  intervals  (Zeng  and  Davidian,  1997b):  inversion  intervals 
and  Wald  intervals ,  additionally,  we  also  discuss  bootstrap  calibration  intervals.  A 
discussion  on  Bayesian  credible  intervals  is  deferred  until  Section  3.4.  For  the  most 
part,  we  assume  the  regression  model  (3.1)  with  normal  errors  and  constant  variance 
is  appropriate  and  that  the  predictor  values  are  fixed  by  design. 

28 


3.3.1  Inversion  interval. 

Similar  to  inverting  the  calibration  curve  to  obtain  a  point  estimate,  a  confidence 
interval  for  xq  can  be  constructed  by  inverting  a  prediction  interval  for  the  response. 
We  refer  to  this  type  of  calibration  interval  as  the  inversion  interval.  The  inversion 
interval  relies  on  the  distribution  of  the  predictive  pivot ,  Q,  which  for  the  linear 
calibration  problem  is  given  by 


(3,9) 


Q  = 


3^0  -  A)  -  Pixq 


at 


j_  I  1  I  (■ XQ-x )2 

m  '  n  '  Sxx 


It  can  be  shown  (Graybill,  1976)  that  Q  ~  T  [n  +  m  —  3),  hence, 


Pr  [ta/2,n+m—3  ^  Q  ^  1 1_ a/2,n+m—  3)  1 


a. 


Squaring  both  sides  of  the  inequality  and  expanding,  we  obtain  a  simple  quadratic  in 
Xq : 


(3.10) 

where 


Pr  [ax 0  +  bx 0  +  c  <  0)  =1 


a, 


a  =  P(-  aft2 /S. 


6  =  2 


c  = 


xa2t 2 

^  XX 


-  ft  (fto  -  ft)  -  3? 


y0-p0)  - o\t2  (—  +  -  + 


t  =  1 1 


1— a/2,n+ra— 3* 


An  exact  100(1  —  a)%  confidence  interval  for  xo  is  given  by  the  set 
(3.11)  J/cai  (a?)  =  {A  :  +  bx  +  c  <  0  j  . 

Although  we  use  the  term  confidence  interval  here,  it  is  possible  that  the  values 
of  x  that  satisfy  this  inequality  do  not  form  an  actual  interval.  In  particular,  four 
possibilities  exist: 
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(i)  the  set  is  a  finite  interval,  Jce\  (x)  =  (L,  U ); 


(ii)  the  set  is  the  entire  real  line,  Jca\  (x)  =  (—00,  00) ; 

(iii)  the  set  consists  of  two  semi- infinite  intervals,  Jca\  (x)  =  (—00,  U)  U  (L,  00); 


(iv)  the  set  is  empty,  Jc&\  (x)  =  0. 


The  circumstances  leading  to  (i)-(iv)  are  depicted  in  Figure  3.2.  A  finite  interval 
(i)  will  occur  if  and  only  if  a  >  0  and  b2  —  4 ac  >  0.  Upon  closer  inspection  of  the 
quadratic  in  Equation  (3.10),  we  see  that  this  occurs  when  a  >  0  or  rather  when 
(32 / (a2/ Sxx)  >  t2.  In  other  words,  when  the  slope  f3\  is  significantly  different  from 
zero  at  the  specified  a  level  (i.e. ,  the  regression  line  is  not  too  flat),  the  solution  to 
Equation  (3.11)  forms  a  100(1  —  a)%  confidence  interval  for  Xq  and  is  given  by 


(3.12) 


-  +  ft(S'-S)±gx 

a  a 


teo-s)! 

m  n  J  bTT 


or  equivalently 


Xq  + 


(x0  -  x)g  ± 


\J (^o  -  x)2/Sxx  +  (1  -  g)  (£  +  J) 

1  ~9 


where  g  =  (t2a2)  /(/3fSxx).  The  dependence  on  a  statistically  significant  slope  is 
generally  not  regarded  as  a  concern  since  “any  self-respecting  calibrator  will  design 
a  calibration  experiment  such  that  his  or  her  instrument  is  expected  to  have  a 
statistically  significant  slope  ...”  (Brown,  1993,  pp.  25).  Hoadley  (1970)  noted 
that  the  width  of  interval  (3.12)  depends  on  the  value  of  the  observed  test  statistic, 
T *  =  /3fSxx/a2  =  t2,  for  testing  T-L0  :  /3i  =  0  versus  T-L\  :  /3i  ^  0.  A  large  value  of 
T*  is  associated  with  a  smaller  interval  and  vice  versa.  An  approximation  can  be 
obtained  by  setting  g  —  0  in  Equation  (3.12),  which  gives 

(3.13)  ±  (A)  J-  +  -  + 

V/fi/  V 171  n  Sxx 
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This  approximation  is  useful  when  g  is  small,  say  g  <  0.05  (Draper  and  Smith,  1998). 
Fieller’s  method  (Fieller,  1954),  which  applies  to  the  ratio  of  normally  distributed 
random  variables,  can  also  be  used  to  derive  interval  (3.12)  using  a  fiducial  argument. 
Extending  the  inversion  interval  to  the  case  of  multiple  predictors  is  discussed  in 
Draper  and  Smith  (1998,  pg.  229)  and  in  Brown  (1993,  chap.  3).  Brown  considers 
the  special  case  of  polynomial  regression. 


Figure  3.2:  Solutions  to  the  equation  ax2  +  bx  +  c  <  0.  Top  left:  The  set  is  an 
interval.  Top  right :  The  set  is  empty.  Bottom  left:  The  set  consists  of  two  semi¬ 
infinite  intervals.  Bottom  right:  The  set  is  the  entire  real  line. 
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For  nonlinear  calibration  curves, 


(3.14) 


Q 


yo  —  ^ 


d\jm  +  Var  l  y  (x0]  (3 


is  only  an  approximate  pivot.  We  assume  Q  ~  A7(0, 1)  as  n  goes  to  oo.  An 
approximate  100(1  —  a)%  confidence  interval  for  x0  based  on  Equation  (3.14)  is  the 
set 


(3.15)  Jcai(x)  =  <  x  :  za/ 2  < 


Vo  - //  [x:(3 


a/ /m  +  Var  <  y  ( x]  (3 


<  Z l-a/2  /  j 


where  za/2  and  Z\-a/2  are  the  a/2  and  1  —  a/2  quantiles  of  the  standard  normal 
distribution,  respectively.  To  be  more  conservative,  we  can  replace  the  normal 
quantiles  with  those  of  a  T  (n  +  m  —  p  —  1)  distribution  (p  being  the  dimension  (3 ). 
Unlike  the  linear  calibration  problem,  the  solution  to  Equation  (3.15)  cannot  be 
written  in  closed-form  and  will  require  iterative  techniques. 

For  the  special  case  m  =  1,  the  inversion  interval  is  equivalent  to  drawing  a 
horizontal  line  through  the  scatterplot  of  the  standards  at  yo  and  finding  the  abscissas 
of  its  intersection  with  the  100(1  —  a)%  (pointwise)  prediction  band  of  the  calibration 
curve.  For  the  straight  line  case,  if  /3i  is  not  significantly  different  from  zero,  the 
regression  line  is  not  well  determined  and  the  horizontal  line  drawn  at  yo  will  not 
intersect  the  prediction  band  at  two  points,  leading  to  cases  (ii)  or  (iii)  outlined  above; 
this  point  is  illustrated  in  Figure  3.3.  The  issue  in  linear  calibration  is  that  prediction 
bands  are  really  hyperbolas  which,  depending  on  the  quality  of  the  model/data,  can 
bend  quite  severely.  To  circumvent  this  problem,  Trout  and  Swallow  (1979)  proposed 
a  similar  procedure  based  on  uniform  prediction  bands  (i.e.,  prediction  bands  that 
are  parallel  to  the  calibration  line).  This  has  the  advantage  of  always  producing  a 
(symmetric)  confidence  interval  for  xq,  without  sacrificing  efficiency. 
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Figure  3.3:  Common  prediction  band  shapes.  Left:  Horizontal  line  at  t/0  intersects  the 
prediction  band  at  two  points,  resulting  in  a  finite  interval.  Middle :  Horizontal  line  at 
7/0  does  not  intersect  the  prediction  band  at  all  resulting  in  an  infinite  interval.  Right: 
Horizontal  line  at  yo  only  intersects  with  one  side  of  the  prediction  band  resulting  in 
two  semi-infinite  intervals. 

Similarly,  we  can  compute  the  inversion  interval  for  nonlinear  calibration 
problems  by  drawing  the  prediction  band,  however,  inference  in  nonlinear  regression 
often  relies  on  linear  approximations,  large  samples,  and  approximate  normality.  For 
nonlinear  calibration  curves,  the  bootstrap  (see  Section  3.3.3)  may  provide  more 
accurate  results.  Drawing  the  prediction  band  for  calibration  curves  also  gives  an  idea 
as  to  what  values  of  yo  lead  to  meaningful  interval  estimates  for  xq.  For  example,  an 
observed  value  y0  too  close  to  a  horizontal  asymptote  will  produce  a  useless  confidence 
interval  for  xo  (if  at  all).  Graphical  methods  like  this  are  discussed  by  Jones  and  Lyons 
(2009)  who  extend  the  approach  for  longitudinal  data  with  bivariate  response. 

3.3.2  Wald  interval. 

Another  common  approach  for  obtaining  a  confidence  interval  for  the  unknown 
Xo  is  to  use  the  delta  method  (VerHoef,  2012;  Dorfman,  1938),  also  see  Casella  and 
Berger  (2002).  Let  the  variance-covariance  matrix  of  be  given  by  E  where 

Cov  jjd),  /3  j 
Var  j/3  j 


Var{y0} 
Cov  {y0,p} 
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Recall  that  (3  is  a  linear  function  of  the  observations  y.  Since  3d)  is  independent 
of  y,  it  follows  that  3d)  and  {3  are  also  independent;  hence,  Cov  jddb/^j  =  0? 
Therefore,  E  simplihcs  to 


JpXp • 


E  = 


a,/ rn  0pXp 

Opxp  ae2(X'X) 


The  classical  estimator,  x0,  has  the  form  x  =  p~l  ( y,  (3).  Let  (r/;  (3)  =  [y]  (3) 

and  fio1  (y,  (3)  =  jpP^1  ( y]  (3 ).  Note  that  if  p  (the  dimension  of  (3)  is  greater  than 
one,  then  /Lj"1  (y,  (3)  will  be  a  vector  valued  function.  The  delta  method  estimate  of 
the  variance  of  x0  =  p^1  ^3d>;/3^,  based  on  a  first-order  Taylor  series  expansion,  is 
given  by 


2  r 


(3.16)  Var{x0}  =  —  p11(y0;/3 
m 


1  2 


+  <7e 


H2l[y o;/3  (X'X)-1  n?  (y0;(3 


For  the  simple  linear  calibration  problem,  Equation  (3.16)  reduces  to 


<7. 


2  r 


Var{x0}  = 

'  Pi  L 


1  1  (x0  —  X 

- 1 - b  — - : 

m  n  Srr 


.2  n 


The  estimated  standard  error  (se)  of  xo  is  just  se{xo}  =  yVar{x0}- 
Assuming  that 


(3.17)  W  =  1°  ~  A7(0, 1) 

se{x0) 

for  “large”  n  leads  to  an  approximate  100(1  —  a)%  Wald-based  conhdence  interval 
for  x0  of 


(3.18)  Xo  i  ^1— a/2,n+m— p—  1  '  Se  {To}  , 

where  ti_Q/2,n+m-p-i  is  the  1  —  a/2  quantile  of  a  Student’s  ^-distribution  with 
n  +  m  —  p  —  1  degrees  of  freedom.  If  the  sample  size  is  large  enough,  we  could 
replace  tl  —  a/2n  +  m  —  p  —  1  with  zi_q/2,  the  1  —  a/2  quantile  of  a  standard 
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normal  distribution.  Unlike  the  inversion  interval,  Equation  (3.15),  the  Wald-based 
interval  always  exists  and  is  symmetric  about  the  point  estimate  xq.  This  approach 
is  useful  when  it  is  difficult  (or  impossible)  to  invert  a  corresponding  prediction 
interval.  The  symmetry  of  the  Wald  interval,  however,  may  be  unrealistic  in  nonlinear 
calibration  problems  when,  for  example,  yo  is  near  a  horizontal  asymptote  (Schwenke 
and  Millikem,  1991).  Perhaps  the  biggest  drawback  is  the  approximate  normality 
assumption  for  W,  which  is  not  always  reasonable  in  practice.  For  example,  in  simple 
linear  calibration,  the  distribution  of  xq  may  not  even  be  unimodal  (Buonaccorsi, 
1986).  On  the  other  hand,  for  the  simple  linear  calibration  problem,  the  Wald-based 
interval  is  equivalent  to  the  approximate  inversion  interval  given  in  Equation  (3.13). 
Thus,  provided  g  is  “small,”  the  Wald  interval  may  perform  well  even  when  W  is  not 
asymptotically  normal. 

Schwenke  and  Millikem  (1991)  discuss  the  inversion  and  Wald  confidence  intervals 
for  nonlinear  calibration.  Using  simulation,  they  showed  that  the  inversion  and  Wald 
intervals  can  attain  the  desired  confidence  level  in  samples  of  size  20  for  a  nonlinear 
exponential  decay  model.  They  also  discuss  testing  the  equality  of  two  calibration 
points.  However,  Schwenke  and  Millikem  treat  3^o  as  a  fixed  constant.  In  regulation- 
type  problems,  this  is  the  correct  approach.  However,  in  calibration,  the  confidence 
interval  for  x0  wifi  be  too  narrow.  To  illustrate,  consider  the  conventional  treatment 
group  of  the  postmortem  data  analyzed  in  Schwenke  and  Millikem  (1991).  In  essence, 
Schwenke  and  Millikem  computed  the  time  corresponding  to  a  mean  pH  level  of  6.0, 
not  an  observed  pH  level  of  6.0.  They  list  a  95%  Wald-based  confidence  interval  for  x0 
of  (3.315,4.317).  Accounting  for  the  correct  variation,  however,  this  interval  should 
actually  be  (2.615,5.016).  The  standard  error  they  computed  for  x0  is  based  on 
T  =  (/30,  /A,  P2)'  whereas  here  it  is  based  on  T  =  ((Tea  An  /A,  fa)' ■  In  short,  Schwenke 
and  Millikem  only  account  for  the  variance  of  the  estimated  regression  parameters 
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whereas  in  a  true  calibration  problem,  the  variance  of  3^o  needs  to  be  taken  into 
account. 

There  is  an  interesting  relationship  in  the  linear  calibration  problem  (with  m  —  1) 
between  the  Wald  interval  for  Xo,  interval  (3.18),  and  a  prediction  interval  for  (To-  Let 
Lw  and  Uw  denote  the  lower  and  upper  bounds,  respectively,  from  a  100(1  —  a)%  Wald 
confidence  interval  for  x0  corresponding  to  an  observed  y0.  If  x0  is  assumed  given,  then 
it  is  easy  to  show  that  a  100(1  — a) %  prediction  interval  for  is  ( f3o+f3iLw ,  /3q+/3iUw ) 
if  /3i  >  0  and  (/?o  +  (3iUw,f3o  +  f3iLw)  if  /%  <  0.  This  relationship  is  illustrated  in 
Figure  3.4  for  the  arsenic  example. 


Figure  3.4:  Scatterplot  of  the  arsenic  data  with  fitted  calibration  line.  The  horizontal 
arrow  represents  the  response  measurement  for  the  new  sample  and  the  vertical  arrow 
represents  the  ML  estimate  of  the  unknown  concentration.  The  vertical  gray  band 
represents  a  95%  Wald  confidence  interval  for  x0  corresponding  to  y0  =  3  and  the 
horizontal  gray  band  represents  a  95%  prediction  interval  for  (To  corresponding  to 
x0  =  x0. 
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3.3.3  Bootstrap  intervals. 

In  many  applications,  the  calibration  curve  is  usually  nonlinear  (e.g.,  dose- 
response  curves  and  other  types  of  assay  data).  Although  the  inversion  and  Wald 
procedures  can  still  be  applied,  they  can  be  highly  inaccurate  when  the  sample  size  is 
small.  A  useful  alternative  in  these  situations  is  to  use  the  nonparametric  bootstrap 
(Efron,  1979),  which  is  a  computer  intensive  technique  based  on  sampling  with 
replacement  from  the  observed  data.  As  such,  the  bootstrap  provides  an  alternative 
means  to  computing  bias,  standard  errors,  and  confidence  intervals.  Unlike  the  delta 
method,  however,  which  is  only  first-order  accurate,  the  bootstrap  can  often  have 
“second-order”  accuracy  (Casella  and  Berger,  2002,  pg.  517).  The  bootstrap  is  not 
without  assumptions  (e.g.,  independent  observations  are  still  required),  but  it  does 
allow  us  to  relax  the  usual  assumptions  of  large  sample  size  and  normality.  The 
supporting  mathematics  can  be  quite  involved,  but  the  interested  reader  is  pointed 
to  Efron  and  Tibshirani  (1994)  and  Hall  (1992).  A  detailed  and  practical  guide  to 
the  bootstrap  is  given  by  Davison  and  Hinkley  (1997). 

Let  Jii  =  p  (xp,  (3^  be  the  fitted  values  and  f3  be  the  least  squares  estimate  of  f3. 
The  two  (nonparametric)  bootstrap  resampling  schemes  for  regression  are: 

case  resampling:  pairs  of  data  are  sampled  with  replacement  to  produce 

(•Dj  Vl)  )  •  •  •  )  (A)  Vn)  1 

model-based  resampling:  the  residuals  e\, . . . ,  en,  or  a  modified  version 

thereof,  are  sampled  with  replacement  and  added  to  the  fitted  values  to  produce 
(x1,p1  +  e*1),...,(xn,pn  +  e*n). 

Efron  and  Tibshirani  (1994)  and  Davison  and  Hinkley  (1997)  discuss  the  merits  of 
both  schemes.  Model-based  resampling  is  more  appropriate  when  the  values  of  the 
independent  variable  are  fixed  by  design  (e.g.,  controlled  calibration  experiments) 
and  the  errors  have  constant  variance.  Resampling  cases,  on  the  other  hand, 
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is  less  accurate  but  more  robust  to  violations  of  the  model  assumptions  such  as 
homoscedastistic  errors. 

Various  bootstrap  approaches  to  calibration  have  been  suggested  in  the  literature. 
Rosen  and  Cohen  (1995b)  discuss  controlled  calibration  based  on  a  parametric 
bootstrap  where,  instead  of  sampling  directly  from  the  residuals,  samples  are  drawn 
from  a  normal  distribution.  This  procedure  will  work  even  when  the  calibration  curve 
is  estimated  nonparametrically  (e.g.,  cubic  smoothing  splines ).  Zeng  and  Davidian 
(1997a)  commented  that  the  procedure  proposed  by  Rosen  and  Cohen  requires  a  large 
number  of  resamples  to  achieve  the  desired  accuracy.  They  suggested  a  bootstrap 
adjustment  to  the  inversion  and  Wald  intervals  that  requires  far  fewer  bootstrap 
resamples.  Given  the  speed  and  multicore  functionality  of  modern  computers,  a  large 
number  of  resamples,  say  10,000  or  more,  is  less  of  a  concern. 

In  this  section,  we  outline  a  general  bootstrap  approach  to  controlled  calibration 
in  Algorithm  1,  but  first  we  discuss  a  rather  naive  approach.  For  an  observed  t/0, 
suppose  we  compute  R  bootstrap  replicates  of  xq,  and  then  use,  for  example,  the 
sample  a/2  and  1  —  a/2  quantiles  as  a  100(1  —  a)%  confidence  interval  for  xq.  This 
results  in  y0  being  treated  as  a  fixed  parameter  in  the  bootstrap  simulation,  but  in 
fact,  l/o  is  an  observed  value  of  the  random  variable  Vo  which  has  variance  a//m.  In 
other  words,  some  variation  is  not  getting  accounted  for  in  this  approach.  This  is 
akin  to  ignoring  Var  {Vo}  in  the  delta  method  discussed  previously.  One  solution  is 
to  simulate  the  correct  variance  by  adding  a  small  amount  of  noise  to  the  observed 
responses  from  the  second  stage  of  the  calibration  experiment. 

A  few  issues  regarding  the  residuals  in  Algorithm  1  are  worth  considering.  For 
the  linear  case,  it  is  preferable  to  scale  the  residuals  before  centering  (Davison  and 
Hinkley,  1997).  In  particular,  compute  r%  =  e,/V  1  —  ha,  where  ha  are  the  diagonal 
elements  of  the  hat  matrix,  H  =  X'(X'X)~1X.  We  then  sample  e*  from  the  modified 
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Algorithm  1:  Model-based  resampling  for  controlled  calibration. 

for  r  =  1  to  R  do 

(1)  for  i  =  1  to  n  do 

(a)  set  x*  =  Xi] 

(b)  randomly  sample  e*  from  the  centered  residuals  e'1; . . . ,  e!n\ 

(c)  set  y*  =  //  (xiif?)  +  e*; 

end 

(2)  Fit  model  to  (x\,y\) , . . . ,  (x^,y*),  giving  estimates  (3*,  of*; 

(3)  for  j  —  1  to  m  do 

(a)  randomly  sample  e*  from  the  centered  residuals  e[, . . . ,  e'n] 

(b)  set  ylj  =  y0  +  e*; 

end 

(4)  Set  y*r  =  ELi  Vo klm', 

(5)  Set  x*0r  =  y-1  (y^r:  /§;) ; 

if  calculating  an  interval  based  on  the  studentized  bootstrap,  then 
additionally: 

(6)  Compute  se  {x^r}; 

(7)  Set  Wf  =  (x*0r  -  x0)/se{^r}; 

end 


residuals  ry  —  f , . . .  ,rn  —  f.  This  transforms  the  residuals  so  that  they  are  centered 
and  have  constant  variance  of.  For  nonlinear  calibration  curves,  the  residuals  should 
be  corrected  for  bias  in  addition  to  centering  them  (Davison  and  Hinkley,  1997). 
When  there  are  outliers  in  the  residuals,  the  bootstrap  distribution  of  To  can  become 
skewed  or  multimodal.  One  suggestion  is  to  replace  the  residuals  in  step  (3)  with 
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something  smoother,  say  random  variates  from  a  normal  distribution.  For  the  case 
m  >  1,  Jones  and  Rocke  (1999)  suggest  using  a  “residual  pool”  formed  by  e\, . . .  ,en 
plus  residuals  computed  from  the  unknowns:  en+j  =  yn+j  —  yo  (j  =  1,2, ...  ,m).  In 
Section  3.2,  we  mentioned  that  it  is  often  reasonable  to  assume  cq2  =  a\ (.  However, 
if  this  assumption  is  not  valid,  then  we  can  still  proceed  by  resampling  separately 
from  the  two  sets  of  residuals  { e, } t  and  {e, ’]n^+l  (Gruet  and  Jolivet,  1993;  Jones 
and  Rocke,  1999).  For  instance,  we  can  resample  from  {e*}”  in  step  (1)  and  from 
\ejjj_n+ 1  in  step  (3)  of  Algorithm  1.  We  can  also  accommodate  nonconstant  variance 
(i.e. ,  heteroscedasticity)  by  applying  the  wild  bootstrap  (Davison  and  Hinkley,  1997, 
pp.  272).  This  approach  is  discussed  in  Huet  et  ah  (2004,  pp.  142). 

Once  we  have  our  bootstrap  replicates,  a  number  of  different  bootstrap  confidence 
intervals  can  be  constructed.  For  a  studentized  interval,  we  compute  R  bootstrap 
replicates  of  W  (Equation  (3.17)):  W*,  W£ , . . . ,  W^.  This  is  outlined  in  steps  (6)- 
(7)  of  Algorithm  1.  Instead  of  relying  on  normal  theory  assumptions  about  W,  as 
when  computing  the  Wald  interval,  the  bootstrap  estimates  the  distribution  of  W 
directly  from  the  data.  Thus,  instead  of  approximating  the  quantiles  of  W  using  a 
standard  table,  a  “table  is  built  for  the  data  at  hand”  (Efron  and  Tibshirani,  1994). 
An  approximate  100(1  —  a)%  confidence  interval  for  xq  based  on  the  studentized 
bootstrap  is  given  by 

(3.19)  (x0  ~  7i-a/2  ‘  {£o}  ,  -  7a/2  ■  se  {x0})  , 

where  and  7i_0/2  are  the  sample  quantiles  from  the  bootstrap  distribution  of 
W.  One  drawback  of  this  approach  is  that  it  requires  an  estimate  of  the  standard 
error,  se{x^r}.  We  could  use  a  Taylor  series  approximation,  as  in  the  delta  method, 
but  this  is  known  to  be  unreliable  for  small  samples.  Another  alternative  is  to  use 
the  bootstrap  to  estimate  the  standard  error  of  each  x^r.  This  implies  a  second  level 
of  bootstrapping  nested  within  the  first  and  can  be  computationally  expensive.  The 
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studentized  interval  for  controlled  calibration  using  the  large-sample  formula  for  the 
standard  error  (see  Equation  (3.16))  is  discussed  by  Zeng  and  Davidian  (1997a)  and 
Jones  and  Rocke  (1999).  This  interval  can  perform  poorly  in  practice  and  is  easily 
influenced  by  outliers  in  the  data. 

Yet  another  technique  (Jones  and  Rocke,  1999;  Huet  et  ah,  2004)  is  to  bootstrap 
the  predictive  pivot  in  Equation  (3.14): 


It  is  is  easy  to  see  that,  for  the  linear  calibration  problem,  Q *  =  W*;  therefore, 
the  resulting  interval  is  the  same.  For  the  nonlinear  calibration  problem,  however, 
a  100(1  —  a)%  confidence  interval  for  xq  based  on  the  bootstrapping  the  predictive 
pivot  is  given  by  the  set 


JcJx)  =  {%■  C/2  <Q<  q*-a/2}  > 


where  g*/2  and  C-a/2  are  the  «/ 2  and  1  —  ct/2  quantiles  of  the  bootstrap  distribution 
of  Q,  respectively.  This  mimics  the  inversion  method  discussed  in  Section  3.3.1  but 
usually  performs  better  since  we  no  longer  have  to  rely  on  the  approximate  normality 
of  Q.  Therefore,  we  might  think  of  this  as  a  bootstrap  adjusted  inversion  interval. 
According  to  Huet  et  al.  (2004),  this  procedure  performs  reasonably  well  in  practice, 
even  with  small  sample  sizes  and  R  as  small  as  200. 

On  the  other  hand,  we  can  avoid  computing  a  pivot  altogether  by  working  directly 
with  the  bootstrap  replicates  of  xq  from  step  (5)  of  Algorithm  1.  A  simple  and  often 
satisfactory  procedure,  known  as  the  percentile  bootstrap,  uses  the  sample  a/2  and 
1  —  a/2  quantiles  of  . . .  ,XqR  as  a  confidence  interval  for  x0-  A  more  accurate 
technique  is  the  bias- corrected  and  accelerated  ( BCa )  interval,  which  is  essentially 
a  modification  of  the  percentile  interval;  for  details  see  Efron  and  Tibshirani  (1994, 
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chap.  14,  sec.  3).  The  BCa  method  tends  to  work  well,  but  usually  requires  R  to 
be  large.  It  is  both  second-order  accurate  (Efron  and  Tibshirani,  1994,  pp.  187) 
and  transformation  respecting.  For  example,  if  we  want  a  BCa  confidence  interval  for 
log(x0),  we  can  simply  log  the  endpoints  of  the  corresponding  BCa  interval  for  xq.  The 
studentized  interval  is  also  second-order  accurate,  but  not  transformation  respecting, 
while  the  percentile  interval  is  transformation  respecting  but  only  first-order  accurate. 
For  an  in-depth  discussion  regarding  the  different  bootstrap  confidence  interval 
procedures,  see  Davison  and  Hinkley  (1997,  chap.  5). 

A  related  method  for  computing  calibration  intervals,  based  on  the  jackknife, 
was  given  by  Miller  (1974).  Let  /3  =  h(/3)  be  a  function  of  the  regression  parameters. 
Miller  showed  that,  under  reasonable  conditions,  the  jackknife  estimate 

^  ^  n  —  1  n  ^ 

/-(jack  =  n(3  ^  ]  P(—i)i 

2=1 

where  /?(-«)  denotes  the  estimate  of  (3  with  the  i-th  observation  removed  from  the 
data,  is  asymptotically  normally  distributed  (even  if  the  errors  et  are  not  normally 
distributed).  Hence,  an  approximate  100(1  —  a)%  confidence  interval  for  xq  can  be 
developed  that  does  not  require  a  normality  assumption  for  the  errors  of  the  model. 
However,  x0  is  not  just  a  function  of  (3,  but  also  of  y 0.  This  is  not  a  problem  as  long 
as  m  >  1  and  m/n  — *  c,  0  <  c  <  oo.  For  regulation  (Graybill  and  Iyer,  1994,  pp.  431- 
432),  Xq  is  a  function  of  the  regression  parameters  only;  hence,  the  jackknife-based 
interval  is  easily  applied. 

3.4  Bayesian  calibration 

Although  calibration  has  been  the  subject  of  much  discussion  and  debate  from 
a  frequentist  point  of  view,  it  is  just  as  intriguing  from  a  Bayesian  perspective. 
Let  y ,  yo,  and  data  represent  the  data  from  the  standards,  unknowns,  and  both, 
respectively.  Also,  if  x  has  a  nonstandardized  f-distribution  with  mean  //,  precision 
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r  (i.e. ,  recipricol  of  the  variance),  and  k  degrees  of  freedom — denoted  x  ~  t) 
then  the  probability  density  function  of  x  is 


/(x;/r,r,  k) 


r(fc/2  + 1/2)  r  +  (x-rf- 

Y(k/2)\ZrfcK  _  Tk 


—  (fe+l)/2 


J 


for  — oo  <  x  <  oo,  — oo  <  /i  <  oo,  r  >  0,  and  k  >  1. 

Two  influential  papers  on  Bayesian  calibration  are  Hoadley  (1970)  and  Hunter 
and  Lamboy  (1981).  Aitchison  and  Dunsmore  (1980,  chap.  10)  discuss  Bayesian 
solutions  to  the  linear  calibration  problem  for  both  natural  and  controlled  calibration 
experiments.  They  also  briefly  discuss  calibration  under  a  general  utility  structure. 
Dunsmore  (1967)  derived  the  inverse  estimator  (3.6)  as  the  conditional  mean  of 
1 data  assuming  independent  observations  from  a  bivariate  normal  distribution  (i.e., 
Gaussian  data  from  a  natural  calibration  experiment).  For  controlled  calibration 
experiments,  Hoadley  (1970)  argued  that  the  classical  estimator  is  unsatisfactory, 
since  the  width  of  the  inversion  interval  (3.12)  depends  on  the  magnitude  of  the  F- 
statistic  for  testing  the  significance  of  the  slope;  in  other  words,  the  data  contain 
information  about  the  precision  of  x0.  Because  of  this,  Hoadley  argues  that  less 
weight  should  be  given  to  this  estimator  when  it  is  known  to  be  unreliable,  which  is 
what  a  Bayes  estimator  does.  He  proposed  a  class  of  Bayesian  solutions  assuming,  a 
priori ,  that  xq  is  independent  of  (/3o,  /3i,  cr2).  His  results  are  based  on  a  general  prior 
of  the  form 

7T  (/?0,/3i,cre2,a:o)  =  tt  (/30,  Ai,  a2)  vr  (x0) , 


but  he  obtained  explicit  results  for  the  diffuse  prior  7 r  (fio,  Al,  of)  oc  1  /of.  For  the  case 
m  =  1,  he  was  able  to  derive  the  inverse  estimator  (3.6)  as  a  Bayes  estimator  under 
squared  error  loss  with  respect  to  a  nonstandardized  Student’s  t  prior  distribution  for 


x0: 

(3-20)  + 
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This  leads  to  the  very  simple  result 


(3.21) 


Xoldata  --  Tn-2  \  '.|  -77 

I  Oo 


yy 


1  +  i  +  h^hi 

n  Syy 


A  100(1  —  a)%  shortest  credible  interval  for  xq  can  be  obtained  from  (3.21)  and  is 
given  by 


(3.22) 


Although  this  provides  some  theoretical  justification  for  using  the  inverse 
estimator  in  controlled  calibration,  Hoadley  is  aware  that  it  is  merely  a  by-product 
of  the  Bayesian  approach  and  recommends  a  careful  elicitation  of  prior  information. 
Aitchison  and  Dunsmore  (1980,  pp.  198,  204)  give  extensions  of  distributions  (3.20)- 
(3.21)  and  interval  (3.22)  for  the  case  m  >  1.  As  they  point  out,  this  extension  is 
somewhat  nonsensical  since  the  prior  variance  of  Xq  depends  on  m,  the  number  of 
future  replicates.  This  is  rather  unrealistic,  but  “...  tractability  in  these  calibration 
problems  can  lead  to  an  unnecessary  departure  from  reality  ...”  (Aitchison  and 
Dunsmore,  1980,  pp.  198).  Though  many  authors  have  criticized  the  classical 
estimator  on  the  grounds  that  it  has  infinite  MSE,  Hoadley  also  contested  its  use 
based  on  the  inherent  problems  of  the  inversion  interval  (3.10).  These  difficulties, 
however,  do  not  arise  with  the  shortest  credible  interval  (3.22).  Figure  3.5  shows 
the  prior  and  posterior  distributions  of  Xo  for  the  arsenic  example  (Section  3.2.4) 
using  Hoadley’s  approach.  The  mean  of  the  posterior  is  2.945  /ig/rnl,  the  same  as 
the  inverse  estimator,  and  the  corresponding  95%  shortest  credible  interval  for  xq  is 
(2.547,3.342). 


44 


1 


2 


3 


4 


5 


*0 

Figure  3.5:  Bayesian  calibration  for  the  arsenic  example.  The  black  and  green  lines 
represent  the  posterior  and  prior  for  Xq,  respectively.  The  posterior  is  symmetric  and 
centered  at  xq  =  2.945.  The  tick  marks  indicate  the  endpoints  of  a  95%  HPD  interval 
for  xq. 


Hunter  and  Lamboy  (1981)  also  considered  the  linear  calibration  problem,  but 
proposed  a  slightly  different  approach.  Let  rj  =  /30  +  PiX0  and  assume,  a  priori ,  that 
rj  and  (/3o,/%,of)  are  independent.  Assuming  normal  errors,  a  prior  for  the  unknown 
xo  =  (v  ~  Po)/Pi  is  induced  by  specifying  improper  reference  priors  for  77,  p0,  Pi,  and 
of.  That  is,  assuming 

7T  (77,  p0,  pu  al)  =  7 r  (^0,  Pi,  of)  7T  (77)  oc  1/ of. 

This  is  quite  different  from  the  approach  put  forth  by  Hoadley  (1970)  who  does 
not  treat  x0  as  an  explicit  function  of  77,  pQ,  and  Pi.  Hunter  and  Lamboy  also 
discuss  the  case  where  of  is  known,  or  at  least,  assumed  known.  The  posterior  they 
obtain  for  xq  is  equivalent  to  the  posterior  density  for  the  ratio  of  bivariate  normal 
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random  variables  (of  known)  or  bivariate  t  random  variables  (of  unknown),  both  of 
which  have  infinite  variance.  The  latter  is  actually  a  generalization  of  the  structural 
distribution  for  xq  obtained  by  Kalotay  (1971).  A  more  thorough  analysis  of  these 
posteriors  is  given  by  Hunter  and  Lamboy  (1979a)  and  Hunter  and  Larnboy  (1979b). 
Hunter  and  Lamboy  claim  that,  under  reasonable  conditions,  the  inversion  and  Wald 
intervals  (Section  3.3.1  and  Section  3.3.2)  provide  accurate  approximations  to  the 
highest  posterior  density  (HPD)  region  for  xq.  That  said,  Hill  (1981),  Orban  (1981), 
Lwin  (1981),  and  Brown  (1982)  all  noted  that  Hunter  and  Lamboy’s  method  offer 
some  Bayesian  justification  for  the  classical  estimator  and  confidence  intervals. 

Lawless  (1981)  criticized  the  authors  for  not  using  a  more  flexible  family  of 
priors  and  described  the  sole  reliance  on  improper  priors  as  “...  merely  attempting  to 
dress  classical  frequency  procedures  in  Bayesian  clothes.”  Hill  (1981)  made  the  same 
criticism  and  argued  that  a  carefully  selected  gamma  prior  for  [3\  would  have  been 
more  useful  since,  in  practice,  it  is  often  reasonable  to  assume  that  /3\  is  positive  and 
not  too  close  to  zero.  Lawless  and  Hill  also  pointed  out  that  it  is  more  realistic  to 
assume,  a  -priori ,  independence  of  (/30,  /?i ,  erf)  and  x$  (as  did  Hoadley),  rather  than 
independence  of  ( f30 ,  /A,  of)  and  rj  =  /30  +  /3iX0.  Hill  also  pointed  out  that  the  approach 
used  by  Hunter  and  Lamboy  (1981)  is  just  a  special  case  of  that  proposed  by  Hoadley 
(1970)  with 


{| /3i  |  /of ,  of  unknown 

■ 

|/?i|,  of  known 

On  the  other  hand,  Lwin  (1981)  considered  their  approach  somewhat  attractive  since 
it  “...  leads  to  an  analysis  parallel  to  the  well-known  ‘ratio  of  means  problem’  ...”. 
Lwin  also  objects  to  the  use  of  non-informative  priors  and,  in  agreement  with  Hill, 
argues  that  the  posterior  of  xq  should  be  conditional  on  /3 1  >  0.  Another  major 
criticism  was  that  Hunter  and  Lamboy  provide  no  justification  for  their  choice  of 
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locally  uniform  priors,  nor  did  they  seem  interested  in  exploring  any  properties  of  the 
resulting  posterior  distribution  and  HPD  intervals. 

Bayesian  methods  can  easily  be  extended  to  handle  more  complex  situations.  For 
example,  Racine-Poon  (1988)  discusses  the  essential  role  of  calibration  in  assay-type 
problems  where  relationships  are  inherently  nonlinear.  Following  Hoadley  (1970), 
Poon  assumes  that,  a  priori ,  x0  and  (/3,  a 2)  are  independent,  that  is, 

7T  (/3,  cr2,  x0)  =  vr  (/3,  a2)  vr  (x0) . 

The  resulting  posterior  is  then  given  by 

7T  (/3,  cr2,  Xo | data)  oc  7 r  (y0\x0,  / 3 ,  a2)  vr  (y\(3,  a2)  vr  (/3,  a2)  vr  (x0)  • 

Poon  noted  the  substantial  amount  of  numerical  integration  involved  in  obtaining 
the  posterior  of  xq  and  proposed  instead  an  approximation  method.  Given  the 
current  state  of  statistical  software  such  as  R  (R  Development  Core  Team,  2011) 
and  OpenBUGS  (Lunn  et  ah,  2009),  this  is  less  of  an  issue.  A  good  discussion  on 
nonlinear  calibration  problems  using  proper  priors  is  given  by  Hamada  et  al.  (2003). 
Du  Plessis  and  Van  Der  Merwe  (1996)  use  Bayesian  calibration  to  estimate  the  age 
of  rhinoceros — a  multivariate,  nonlinear  calibration  problem. 

3-4- 1  Nasturtium  example. 

We  end  our  discussion  of  Bayesian  calibration  with  a  nonlinear  calibration 
example.  Consider  the  nasturtium  data  from  Racine-Poon  (1988).  The  objective 
was  to  determine  the  concentrations  of  an  agrochemical  (e.g.,  pesticide)  present  in 
soil  samples.  Bioassays  were  performed  on  a  type  of  garden  cress  called  nasturtium. 
The  response  is  weight  of  the  plant  in  milligrams  (mg)  after  three  weeks  of  growth, 
and  the  predictor  is  the  concentration  of  the  agrochemical  in  the  soil.  In  the  first  stage 
of  the  experiment,  six  replicates  of  the  response  W  were  measured  at  each  of  seven 
preselected  concentrations  Xi  (g/ha).  Figure  3.6  shows  a  scatterplot  of  the  standards 


47 


with  concentration  on  the  natural  log  scale  (we  added  0.01  to  the  zero  concentrations 
before  taking  the  logarithm). 


Log  concentration 

Figure  3.6:  Scatterplot  of  the  nasturtium  data  with  fitted  logit-log  model.  The 
horizontal  arrow  corresponds  to  the  observed  yo  =  341.333  mg  and  the  vertical  arrow 
corresponds  to  the  logarithm  of  the  estimated  concentration  log(x0)  =  0.817. 


A  logit-log  regression  function  is  used  to  describe  the  data: 

{  /3i,  x  =  0 

Kx-,  Pu  @2,  @3)  =  < 

^  Pi/  [1  +  exp  {p2  +  @3  ln(x)}] ,  x  >  0, 

The  errors  are  assumed  to  be  normally  distributed  with  mean  zero  and  constant 
variance  of.  Normality  was  checked  using  a  normal  Q-Q  plot  of  the  residuals  and  the 
constant  variance  assumption  appears  reasonable  from  the  scatterplot.  In  the  second 
stage  of  the  experiment,  the  observed  weights  corresponding  to  three  new  soil  samples 
all  sharing  the  same  concentration  xq  were  observed  to  be  309,  296,  and  419  mg.  An 
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estimate  of  the  unknown  concentration  is  obtained  by  inverting  the  fitted  calibration 
curve  and  found  to  be  2.2639.  We  follow  Poon  in  assuming,  a  priori ,  that  xq  is 
independent  of  (fa,  fa,  fa,  of)-  Improper  uniform  priors  were  given  to  the  parameters 
(fa,  fa,  fa,  of),  and  the  prior  for  Xq  was  chosen  to  be  uniform  over  the  experimental 
range:  Xo  ~  W(0, 4).  The  posterior  for  Xo  is  shown  in  Figure  3.7  and  is  nearly  identical 
to  the  approximation  obtained  by  Racine-Poon  (1988,  fig.  9).  A  histogram  of  the 
bootstrap  distribution  of  Xo  is  also  shown  in  Figure  3.7  for  comparison.  The  mean  of 
the  posterior  density  is  2.3434,  and  the  mode  is  2.3124.  A  corresponding  95%  HPD 
interval  for  the  unknown  Xq  is  (1.7566,  3.0011).  For  comparison,  we  also  provide 
the  inversion,  Wald,  and  BCa  intervals  for  Xo  in  Table  3.1.  If  our  prior  information 
accurately  reflects  the  truth,  then  the  HPD  interval  is  preferred.  Otherwise,  the 
inversion  or  bootstrap  interval  are  both  reasonable.  The  bootstrap  distribution  and 
posterior  density  in  Figure  3.7  are  clearly  skewed  to  the  right;  thus,  a  symmetric 
interval,  such  as  the  Wald  interval,  may  not  be  appropriate  here. 


Table  3.1:  Comparison  of  95%  calibration  intervals  for  the  nasturtium  example. 


Inversion  interval 

(1.772,  2.969) 

Wald  interval 

(1.689,  2.839) 

BCa  interval 

(1.805,  2.903) 

HPD  interval 

(1.7566,  3.0011) 
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Density 


Figure  3.7:  Posterior  of  x0  (solid  curve)  together  with  the  bootstrap  distribution  of 
x0  (histogram). 
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IV.  Semiparametric  Calibration 


So  far,  we  have  considered  calibration  curves  in  which  the  mean  response 
fi(x)  =  E  {3^}  has  a  known  form  that  depends  on  a  small  number  of  unknown 
parameters  /3.  Finding  a  good  parametric  model,  however,  can  be  time  consuming 
and  require  a  great  deal  of  expertise.  Therefore,  it  is  sometimes  useful  to  assess  the 
effects  of  the  explanatory  variable  x  without  completely  specifying  the  structural  form 
of  /i{x).  In  this  chapter,  we  propose  a  simple  and  fast  semiparametric  approach  to 
computing  calibration  curves.  By  “semiparametric”,  we  mean  that  only  part  of  the 
model  is  specified.  Our  treatment  of  semiparametric  calibration  curves  follows  the 
work  of  Brumback  et  al.  (1999),  Ruppert  (2002),  and  Ruppert  and  Wand  (2003), 
and  Crainiceanu  et  al.  (2005).  Nonparametric  calibration  has  also  been  discussed 
in  the  literature  by,  for  example,  Clark  (1979),  Clark  (1980),  and  Rosen  and  Cohen 
(1995b).  Our  approach  to  calibration  here  is  similar  to  that  of  Clark  (1980)  in  that  we 
are  inverting  bias-adjusted  prediction  intervals.  However,  the  LMM  representation  of 
P-splines  (Ruppert  and  Wand,  2003)  we  use  here  yields  a  rather  simple  method  for 
making  this  adjustment.  Rosen  and  Cohen  used  a  nonparametric  bootstrap  to  obtain 
calibration  intervals  from  a  cubic  smoothing  spline.  Their  approach,  however,  made 
no  attempt  to  correct  for  bias  in  the  smoothed  calibration  curve. 

In  Section  4.1,  we  discuss  the  linear  mixed-effects  model  representation  of 
P-splines.  Section  4.2  proposes  a  method  for  obtaining  bias-adjusted  calibration 
intervals  based  on  the  mixed  model  representation.  A  small  Monte  Carlo  study 
demonstrates  that  the  these  intervals  have  coverage  probability  close  to  the  nominal 
1  —  a  level.  A  Bayesian  analog  of  this  procedure  is  proposed  in  Section  4.3.  Section  4.4 
concludes  this  chapter  with  a  discussion  on  ideas  for  future  research. 
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4.1  Mixed  model  representation  of  P-splines 

Recall  the  polynomial  spline  model  from  Section  2.4: 

p  K 

(4.1)  y>i  =  ^  Pjxi  +^2ak  (Xi  -  £fc)+  +  €i,  €i  ~  (0,  a2)  ,  2  =  1,...,  n. 

j= 0  k= 1 

We  can  easily  write  this  in  matrix  form,  such  as  in  Equation  (2.13),  but  a  more  useful 
form  is  obtained  by  separating  the  polynomial  and  spline  terms  as  in 

(4.2)  y  =  Xf3  +  Zcl  +  e,  e  ~  (0,  a2l)  , 

where  (3  =  (/3q,  /3\,  . . . ,  /3P)'  are  the  coefficients  of  the  polynomial  basis  functions, 
ol  =  (cci, . . . ,  olk)’  are  the  coefficients  of  the  spline  basis  functions,  and  X  and  Z  are 
known  design  matrices  with  i-th  rows  equal  to 

X,t  =  (1,  Xi ,  x2, . . . ,  2%)'  and  Z%  =  ((: -  Ci)+,  ■  ■  ■ ,  (x{  -  &)+)' , 

respectively.  Although  we  choose  Z  to  be  the  truncated  polynomial  spline  basis 
matrix,  any  other  basis  will  do  (e.g.,  radial  basis,  B-spline  basis,  etc.).  Based  on  the 
matrix  Equation  (4.2),  the  penalized  spline  fitting  criterion,  Equation  (2.14),  becomes 

\\y-  X(3-  Za\\2  +  \2p\\a\\2  , 

which  is  proportional  to  Equation  (2.22),  the  PSS  for  an  LMM  with  hierarchical 
structure  y|a  ~  A f{X/3  +  Za,a2I )  and  cx.  ~  A/^O,  cr^I).  Dividing  through  by  the 
error  variance,  a2,  gives  the  smoothing  parameter  as  X2p  =  a  simple  ratio  of 

the  variance  components.  This  leads  to  the  following  mixed  model  representation  of 
P-splines  (Brumback  et  ah,  1999): 


( 

\ 

OL 

0 

all  0 

y  =  Xf3  +  Zol  +  e, 

~  A/" 

e 

V 

0 

0  a2I 

) 

The  parameters  /3,  ck,  a2,  and  a2  can  all  be  estimated  using  standard  mixed  model 
methodology  and  software.  Under  this  framework,  a  P-spline  is  really  just  the  BLUP 
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from  a  special  LMM!  The  amount  of  smoothing  is  determined  automatically  by 
A  =  of/of,  where  of  and  of  are  the  REML  estimates  of  of  and  of,  respectively. 
Although  A  could  be  estimated  via  ordinary  ML  estimation  or  cross-validation, 
Krivobokova  and  Kauermann  (2007)  showed  that  the  REML-based  estimate  is  less 
affected  by  the  presence  of  different  correlation  structures  for  the  errors,  e.  The  BLUP 
of  pi,  denoted  /5,  was  given  in  Equation  (2.24);  however,  an  equivalent  expression  for 
pi  is  given  by 

=  n  (n'n  +  ^ d^J  ny  =  sy,  n  =  (x-  z) , 

where  D  =  diag  {O^+px^+i),  Ikxk}-  The  vector  of  fitted  values  (i.e. ,  the  EBLUP 
of  pi)  is  then  just 

ny, 

where  of  and  of  are  the  REML  estimates  of  of  and  of ,  respectively. 

As  described  in  Robinson  (1991),  the  LMM  has  a  simple  Bayesian  analog.  For 
the  mixed  model  representation,  Equation  (4.3),  if  /3,  of,  and  of  are  all  given 
improper  uniform  priors,  then  the  posterior  of  pi  is  J\f  (pi,  of  S').  Note  the  use  of  the 
variance- covariance  matrix  of  S’  rather  than  the  variance-covariance  matrix  of  AS" 
from  Equation  (2.16).  It  is  easy  to  see  that  [S]tJ  >  hence,  confidence  and 

prediction  intervals  based  on  the  Bayesian  variance-covariance  matrix  of  S'  are  wider 
than  those  based  on  a^SS1.  As  pointed  out  by  Hastie  and  Tibshirani  (1990),  the 
“extra  wideness”  is  due  to  the  fact  that  S  accounts  for  squared  bias  whereas  SS' 
does  not. 

4.2  Bias-adjusted  calibration  intervals 

As  mentioned  in  Section  2.4.1,  the  estimated  mean  response  is  biased. 

Inference  for  P-splines  based  on  the  LMM  representation,  Equation  (4.3),  differs 
depending  on  whether  we  take  the  randomness  of  a.  into  account.  Ignoring  the 


R 


=  f2  STA2  + 
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randomness  in  a  leads  to  the  same  intervals  given  in  Section  2.4.1.  We  can  account 
for  bias  in  the  confidence  and  prediction  intervals,  however,  by  conditioning  on  the 
random  effects  a.  To  see  this,  note  that  the  bias  vector  of  ft,  conditional  on  ot,  is 


(4.4) 


E  {n  —  /^|ck}  =  X  E  j/3|aj  —  (3  +Z  E{a|a} 


a 


From  the  properties  of  conditional  expectation  (Casella  and  Berger,  2002,  pg. 
164),  we  have  that  E  |e  ^/3|a^  |  =  e|/3|  and  E{E(fi|o:)}  =  E{cf}.  But  since 
E  |  (3 1  =  (3  and  E  {a}  =  E  {a}  =  0,  the  unconditional  bias  is  just 

E  {E  (/I  —  n\ a)}  =  X  (/3  —  (3)  +  Z  (0  —  0)  =  0. 


Thus,  J1  is  unbiased  for  /i  when  averaged  over  the  distribution  of  ot.  This  is  equivalent 
to  the  Bayesian  approach  where  (pointwise)  confidence  and  prediction  bands  use 
diagonal  entries  of  S  in  place  of  the  diagonal  entries  of  SS'  described  in  Section  2.4.1. 

Let  fi0  =  (I;  xo ,  •  •  • ,  Xq,  (x0  —  £i)+, . . . ,  (x0  —  £k)+)  where  x0  is  an  arbitrary 
value  of  the  explanatory  variable  x.  A  100(1  —  a)%  (bias-adjusted)  confidence  interval 
for  h(xq)  is  given  by 

H(xo)  T  tl—a/2 ,df  X  ce^ CIq  D ^  L2g, 

where  df  =  n  —  2  tr  (S)  +tr  ( SS' ).  In  a  similar  manner,  a  100(1  —a)%  (bias-adjusted) 
prediction  interval  for  a  new  observation,  y  =  /j,(x0)+e0  (independent  of  current  ones), 
is  given  as 

(4.5)  h^o)  ±  ti-a/2,df  x  l  +  ^o  fig. 

Recall  the  intuition  behind  the  inversion  interval.  Let  Ipred{x )  be  a  100(1  —  a)% 
prediction  interval  for  the  future  observation  3^0,  such  that  Pr  {3^o  G  ^pred^l^o  =  a;}  = 
1  —  a.  Then,  a  confidence  interval  for  xq  corresponding  to  an  observed  (To  is  just  the 
set  Jc&\(x)  =  {x  :  3^o  G  -fpred(^)}  since,  by  construction, 

Pr  {x0  G  j7'cai(3;o)}  =  Pr  {3^o  G  Ipred(x)\x0  =  x}  =  1  -  a, 
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(Clark,  1980).  In  other  words,  a  100(1  —  a)%  confidence  interval  for  x0  can  be  obtained 
by  inverting  a  corresponding  prediction  interval  for  3V  However,  we  need  to  invert 
an  appropriate  prediction  interval  (i.e.,  one  that  adjusts  for  bias);  otherwise,  the 
resulting  confidence  interval  for  xq  will  likely  not  be  reliable.  To  that  end,  we  propose 
the  following  bias-adjusted  100(1  —  a)%  calibration  interval  for  the  unknown  Xo'- 


(4.6)  JSM  =  < 


Xo  '■  ta/2,df  < 


Vo  -  Kxo) 


<  C- 


1  +  f20  flfl  +  %D 


-1 


Q'r 


1  -a/2,df 


As  before,  great  care  should  be  taken  to  ensure  that  J^{x o)  is  an  interval.  This 
may  not  be  the  case,  for  example,  when  fj,(x)  in  not  monotonic  over  the  range  of 
interest.  Also,  as  with  nonlinear  calibration,  no  closed-form  expression  exists  for 
Jc^ix)  and  iterative  techniques  are  required.  In  a  similar  fashion,  a  100(1  —  a)% 
bias-adjusted  regulation  interval  for  x0  is  obtained  by  inverting  a  corresponding  bias- 
adjusted  confidence  interval  for  the  mean  response: 


<^reg(Xo)  —  |  ^0  :  ta/2,df  <  ,  ^  ^  <  H-a/2,d/ 

[  ^en0  (Q'Q  +  |£>)  S7'0 

We  designed  a  small  Monte  Carlo  study  to  investigate  the  coverage  probability 
of  this  technique.  The  results  of  this  study  are  presented  in  Tables  4.1  and  4.2. 
The  true  calibration  function  is  the  sine  wave  /i(x)  =  sin(7nr  —  7t/2)/2  +  1/2  plotted 
in  Figure  4.1.  For  data  following  such  a  pattern,  it  would  be  tempting  to  fit  a 
nonlinear  model,  perhaps  modeling  /j,(x)  as  a  logistic  function.  Inference  based  on 
the  resulting  fit  would  then  be  inaccurate  since,  for  example,  the  true  calibration 
curve  does  not  have  any  asymptotes.  For  situations  like  this,  it  would  be  useful  to 
have  a  semiparametric  alternative  for  which  to  compare  inference. 
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X 

Figure  4.1:  Scaled  sine  wave  function. 

Notice  that  the  sine  wave  plotted  in  Figure  4.1  is  scaled  to  lie  in  [0, 1]  x  [0, 1]  with 
a  period  of  2.  For  each  of  1,000  simulations,  we  used  10,  30,  and  50  uniformly  spaced 
designed  points  on  the  domain  [0, 1]  with  1,  2,  and  3  independent  replicates  of  the 
response  at  each  design  point.  The  errors  were  generated  as  i.i.d.  J\f( 0,  0.052)  random 
variates  and  the  true  unknown  was  chosen  to  be  xq  =  0.75  (hence,  /io  ~  0.8536). 

Tables  4.1  and  4.2  display  the  estimated  coverage  probability  for  this  technique 
applied  to  the  sine  wave  experiment  for  both  calibration  (Table  4.1)  and  regulation 
(Table  4.2).  For  comparison,  we  also  transformed  the  data  to  an  equivalent  linear 
model  and  applied  Ficller’s  method  (which  yields  an  exact  95%  confidence  interval  for 
xo).  Clearly,  our  bias-adjusted  intervals  performed  well  for  quadratic  (degree  =  2)  and 
cubic  (degree  =  3)  P-splines  with  increasing  sample  size  without  sacrificing  interval 
length.  Notice,  however,  that  the  calibration  results  are  somewhat  conservative  (i.e. , 
the  coverage  probability  is  slightly  larger  than  0.95)  while  the  regulation  intervals 
tend  to  hit  the  target  coverage  1  —  a  =  0.95.  The  reason  for  this  is  that  the  bias  in 
predicting  a  future  observation  is  the  same  as  that  for  estimating  the  mean  response, 
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but  the  former  has  larger  variance.  Therefore,  the  bias  accounts  for  a  smaller  portion 
of  the  mean-squared  error  in  prediction  (i.e.,  the  bias  gets  “washed  out”  by  a  larger 
variance).  Thus,  for  larger  samples  (perhaps  n  >  20  with  m  >  1  replicates  at  each 
design  point),  calibration  intervals  could  be  computed  with  or  without  adjusting  for 
bias.  Regulation  intervals,  on  the  other  hand,  should  always  be  adjusted  for  bias. 

Table  4.1:  Coverage  and  length  of  95%  calibration  intervals  for  data  from  the  sine 
wave  experiment  with  cre  =  0.05. 


P-spline  ( p  =  1)  P-spline  ( p  =  2)  P-spline  ( p  =  3)  Fieller 


n 

m 

CP 

Length 

CP 

Length 

CP 

Length 

CP 

Length 

10 

1 

0.93 

0.25 

0.93 

0.21 

0.90 

0.19 

0.95 

0.25 

10 

2 

0.96 

0.21 

0.97 

0.20 

0.96 

0.19 

0.96 

0.22 

10 

3 

0.97 

0.21 

0.96 

0.20 

0.97 

0.19 

0.96 

0.21 

30 

1 

0.95 

0.21 

0.95 

0.20 

0.97 

0.18 

0.96 

0.21 

30 

2 

0.96 

0.20 

0.96 

0.19 

0.97 

0.18 

0.95 

0.20 

30 

3 

0.97 

0.20 

0.96 

0.19 

0.97 

0.18 

0.96 

0.20 

50 

1 

0.97 

0.20 

0.96 

0.19 

0.96 

0.18 

0.96 

0.20 

50 

2 

0.97 

0.20 

0.97 

0.19 

0.96 

0.18 

0.95 

0.20 

50 

3 

0.98 

0.20 

0.97 

0.19 

0.97 

0.18 

0.96 

0.20 
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Table  4.2:  Coverage  and  length  of  95%  regulation  intervals  for  data  from  the  sine 
wave  experiment  with  ae  =  0.05. 


Degree  =  1 

Degree  =  2 

Degree  =  3 

Fiellcr 

n 

m 

CP 

Length 

CP 

Length 

CP 

Length 

CP 

Length 

10 

1 

0.74 

0.13 

0.92 

0.12 

0.92 

0.11 

0.95 

0.10 

10 

2 

0.89 

0.09 

0.94 

0.09 

0.94 

0.08 

0.95 

0.06 

10 

3 

0.91 

0.07 

0.95 

0.07 

0.94 

0.06 

0.95 

0.05 

30 

1 

0.96 

0.07 

0.95 

0.07 

0.94 

0.06 

0.95 

0.05 

30 

2 

0.97 

0.06 

0.95 

0.05 

0.94 

0.04 

0.95 

0.04 

30 

3 

0.97 

0.05 

0.95 

0.04 

0.95 

0.03 

0.95 

0.03 

50 

1 

0.96 

0.06 

0.95 

0.05 

0.95 

0.05 

0.95 

0.04 

50 

2 

0.97 

0.04 

0.96 

0.04 

0.95 

0.03 

0.95 

0.03 

50 

3 

0.97 

0.04 

0.96 

0.03 

0.95 

0.03 

0.95 

0.02 

4-2.1  Whiskey  age  example. 

The  point  of  this  example  is  to  demonstrate  how  our  bias-adjusted  semiparamet- 
ric  approach  to  calibration  compares  against  a  simpler  parametric  model  (when  one 
is  available).  To  this  end,  we  analyze  the  whiskey  data  from  Schoeneman  et  al.  (1971) 
presented  in  Figure  4.2.  The  data  give  the  proof  (measured  as  twice  the  percentage  of 
alcohol  by  volume,  denoted  2ABV)  of  whiskey  stored  in  a  charred  oak  barrel  against 
time  in  years;  clearly,  some  curvature  is  present.  Suppose  a  new  sample  is  obtained 
from  the  same  barrel  and  that  the  proof  is  observed  to  be  y$  =  108  2ABV.  ft  is 
desired  to  estimate  how  long  this  batch  has  been  aged. 

##  Loading  required  package:  rootSolve 
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Figure  4.2:  Scatterplot  of  the  whiskey  data. 


We  begin  by  fitting  a  simple  quadratic  P-spline  with  five  knots: 

5 

Mag ej  =  A)  +  Mi age j  +  /32proof  ■  +  ^  ak  (proof  ■  -  £fc)+ . 

k= 1 

The  number  of  knots  and  their  placement  were  determined  automatically  using  the 
methods  described  in  Section  2.4.  The  fitted  model  is  depicted  in  the  right-hand  side 
of  Figure  4.3.  It  appears  from  the  scatterplot  that  a  simple  quadratic  may  be  sufficient 
(i.e. ,  dj  =  0,  i  —  1, . . . ,  5);  this  fit  is  depicted  in  the  left  side  of  Figure  4.3.  Both  models 
are  different,  but  the  fits  are  indistinguishable  to  the  human  eye.  In  fact,  both  models 
produce  identical  calibration  intervals  for  x0  (when  rounded  to  four  decimal  places). 
For  the  linear  model,  we  have  Xq  =  5.2329  years  with  a  95%  confidence  interval  for 
xq  of  (4.6776,5.7352).  Similarly,  for  the  P-spline  model,  we  have  xq  =  5.2329  years 
with  a  95%  (bias-adjusted)  confidence  interval  for  x0  of  (4.6776,5.7352).  The  reason 
we  get  the  same  answer  from  both  models  is  that  the  polynomial  basis  part  of  the 
penalized  spline  model  is  sufficient  for  modeling  the  curvature  of  //(age),  therefore, 
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the  spline  basis  terms  are  effectively  “zeroed  out”  (see  Figure  4.4).  Although  our 
P-spline  approach  to  obtaining  calibration  intervals  requires  large  samples,  in  this 
small  sample  example,  they  happen  to  coincide  with  the  classical  inversion  limits 
from  the  quadratic  linear  model.  If  we  did  not  correct  for  bias  in  the  P-spline  model, 
however,  the  confidence  interval  obtained  for  xo  would  be  (4.7223,5.7016),  which  is 
too  narrow. 


o 

o 


112 


110 


108 


106 


104 


Figure  4.3:  Fitted  models  for  the  whiskey  data.  Left:  Quadratic  linear  model  with 
95%  (pointwise)  prediction  band.  Right:  Quadratic  P-spline  with  five  knots  and  95% 
(pointwise)  bias-adjusted  prediction  band. 
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Figure  4.4:  Profiles  of  spline  coefficients  for  the  P-spline  in  Figure  4.3,  as  the 
smoothing  parameter  A  is  varied  from  zero  to  three.  Coefficients  are  plotted  versus 
the  smoothing  parameter  A;  the  REML  estimate  is  A  =  385.2228. 


4.3  Bayesian  semiparametric  calibration 

The  mixed  model  P-spline,  Equation  (4.3),  provides  a  simple  method  for 
estimating  the  smoothing  parameter  and  accounting  for  bias  in  calibration  intervals. 
These  intervals,  however,  do  not  account  for  the  variability  of  the  estimated  smoothing 
parameter  A  =  Sf/cf2  (similar  to  inference  in  LMMs  which  typically  ignores  the 
variability  of  V).  This  problem  can  be  remedied  by  adopting  a  fully  Bayesian 
approach  which  we  now  discuss. 

Let  7r(-)  denote  a  probability  density  function.  Following  Hoadley  (1970),  we 
assume  that  the  calibration  experiment  contains  no  information  about  x0  and  that 
the  priors  for  Xo  and  the  calibration  experiment  are  independent;  thus, 

7r(x0,/3,tt,cre2,oi)  =  7r(xoM/3,tt,o-2,a2). 
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The  mixed  model  representation,  Equation  (4.3),  has  ol  rv_/  A f(0,  ail).  A 
fully  Bayesian  approach,  however,  requires  a  prior  distribution  on  the  parameters 
(/3,  of ,  of ,  o).  Following  standard  convention,  we  assume,  a  priori,  that  the  fixed- 
effects  are  independent  and  assign  vague,  independent  priors  to  (/3,  of,  erf ).  We  used 
the  vague  (but  proper)  priors 

/3j  -  J\f  (0,  of)  ,  j  =  l,...,p  +  l, 
of  ~  (a,  6) , 

of  ~  (c>  , 

where  IQ  stands  for  the  inverse  gamma  distribution.  The  variance  of  should  be 
chosen  large  enough  (say  106)  so  that  (for  all  intents  and  purposes)  the  /ffs  are 
uniform.  Similarly,  the  parameters  a,  b ,  c,  and  d  should  be  small  (say  10~6).  These 
distributions  can  be  considered  as  an  approximate  representation  of  vagueness  in  the 
absence  of  good  prior  information.  In  addition,  we  assume  that  the  prior  for  Xq,  the 
predictor  value  of  interest,  is  uniform  over  the  experimental  range:  Xq  ~  U[a,b\. 

The  (unnormalized)  posterior  density  of  (x0,  (3,  «,  of,  of )  is  given  by 

7r(xo,/3,Q:,cre2,cr^|data)  =  tt(x0,  (3,  a,  cf,  of  |y,  y0) 

oc  n(y,  y0\x0,  (3,  ol,  cr2,  of  )n(x0,  /3,  a,  of,  of) 

oc  7r(y0\x0,  {3 ,  ol,  of ,  of  )7r(t/|/3,  a,  of,  of  )tt(/3,  «,  of,  cf  )7r(x0) 

oc  vr(?/oko,  (3 ,  ol,  a2e)n{y\{3,  ol,  of  )vr(/3)7r(a|of  )vr(of  )?r(cf  )7r(x0), 

where  y  and  yo  represent  the  observed  data  from  the  first  and  second  stages  of  the 
calibration  experiment,  respectively.  It  is  relatively  straightforward  to  show  that  (see 
Section  A.l  in  the  appendix)  the  conditional  posterior  of  (/3,  a)  is  proportional  to 

exp  j-^2  (bo  -  X0{3  -  Z0a ||2  +  ||ck||2^  j  , 
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where 


y 

X 

z 

,  x0  = 

,  Zo  — 

yo 

x'o 

zo 

V  o  = 


are  augmented  data  vectors  and  matrices.  Upon  completing  the  square  we  have  that 


o  \  —  1  /  2  \  — 1 


/3,c*|?/o,:y,U),0e2,0a  ~  ■A/'s  f&o00+-|£>  fjgt/o, of  fJ'0Oo  +  -4.D 


o’. 


where  f2o  =  {Xq,Zq).  In  a  similar  fashion,  the  conditional  posteriors  of  of  and  erf 
are  the  inverse  gammas: 


o'l 1 2/o,  ?/,  /3,  ~  Z£  (  a  +  b  +  ^  ||y0  -  -^o/3  -  ^o«||2  ,  , 

K  1 

0a|yo,3/,/3,at,0e,a;o  (  c  +  — ,d  +  -  ||o:||2 


where  /l  is  the  number  of  knots  or  random  effects  (see  Section  A. 2  in  the  appendix). 
The  conditional  posterior  of  Xo  is  more  difficult  to  obtain  analytically,  however, 
regardless  of  the  prior  7r(xo).  This  makes  it  difficult  (or  impossible)  to  obtain  a 
full  Gibbs  sampler  here,  nonetheless,  we  can  sample  from  the  posterior  of  Xo  using 
more  specialized  Markov  Chain  Monte  Carlo  methods  such  as  the  Metropolis- Hastings 
algorithm  (see,  for  example,  Robert  and  Casella  (2004,  chap.  7)).  Tims,  as  discussed 
in  (Gelman  et  ah,  2003,  pg.  292),  we  could  update  the  parameters  one  at  a  time  using 
Gibbs  sampling  for  (/3,  a,  of ,  cr^)  and  a  metropolis  update  for  x0.  We  illustrate  this 
approach  with  the  following  example  involving  radioimmunoassays.  The  data  were 
analyzed  using  the  JAGS  software  within  R  via  the  rjags  package. 

4-3.1  Enzyme-linked,  immunosorbent  assay  (ELISA)  example. 

Rosen  and  Cohen  (1995a)  derived  a  95%  calibration  interval  for  the  unknown 
concentration  in  a  radioimmunoassay  problem  using  the  nonparametric  bootstrap. 
We  demonstrate  our  method  on  the  same  dataset  and  compare  the  results.  The  data, 
plotted  in  Figure  4.5,  consists  of  23  distinct  concentrations  with  four  (independent) 
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replicates  of  the  response  at  each  concentration.  The  results  from  analyzing  these 
data  are  summarized  in  Table  4.3  at  the  end  of  this  section. 


Figure  4.5:  Scatterplot  of  the  ELISA  data. 


The  four  parameter  logistic  model, 

P2  ~  Pi 


yi  —  Pi  + 


+  £*,  €i  ~  Af  (0,  of)  ,  i  =  l,...,n, 


1  +  exp  {Pi(\ogXi-  Pz)} 
provides  a  reasonable  parametric  fit  model  to  the  ELISA  data.  Although  we  assume 
the  errors  are  normal  with  constant  variance,  Rosen  and  Cohen  (1995a)  more 
generally  assumed  e*  ~  (0,  erf1) ,  where  cq  =  cr  (n(xi,  (3))9 .  This  adds  an  unnecessary 
complication  to  the  calibration  model  and  so  we  simply  assume  constant  variance 
(standard  residual  plots  do  not  reveal  any  serious  indication  of  heteroscedastic  errors). 
Following  Rosen  and  Cohen  (1995a),  we  assume  we  have  a  new  observation  y0  =  20 
with  unknown  concentration  xq.  The  frequentist  (i.e. ,  classical)  estimate  of  Xq  is 
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rather  straightforward  to  derive 


x0  =  exp  <(  —  log 

/?4 


^2  -  VO 


+  /?3  >  =  9.1837. 


KVo  ~  Pi  t 

The  95%  inversion  limits  for  xq  based  on  Equation  (3.15)  are  (7.356, 11.657) — see 
Figure  4.6. 


0  10  20  30  40  50 
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Figure  4.6:  Nonlinear  least  squares  fit  of  the  ELISA  data  to  the  four  parameter  logistic 
model  with  (pointwise)  95%  prediction  band. 

Figure  4.7  shows  a  fitted  semiparametric  calibration  curve,  a  quadratic  P-spline 
with  five  interior  knots.  Also  shown  is  the  posterior  mean  based  on  a  fully  Bayesian 
P-spline  model.  Seeing  as  how  the  concentration  should  not  be  negative,  we  used  a 
U[ 0, 50]  prior  for  xq,  though,  lognormal  or  gamma  priors  may  also  be  reasonable.  The 
frequentist  estimate  of  Xq  is  obtained  by  (numerically)  inverting  the  fitted  P-spline: 
xq  =  9.345.  A  (bias-adjusted)  inversion  interval  for  xq  based  on  Equation  (4.6) 
is  (7.462,11.711).  As  previously  mentioned,  this  interval  does  not  account  for 
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the  variability  of  the  estimated  smoothing  parameter  a* /a20l  hence,  we  expect  the 
Bayesian  credible  interval  to  be  slightly  wider.  The  estimator  we  use  for  the  Bayesian 
model  is  the  posterior  mode  of  Xq  which  is  equal  to  9.285.  There  are  a  number 
of  ways  to  compute  a  95%  credible  interval  from  a  given  posterior;  for  example, 
highest  posterior  density  (HPD)  intervals.  Here  we  simply  report  the  0.025  and  0.975 
quantiles  of  the  posterior  which  are  7.250  and  11.776,  respectively.  As  noted  earlier, 
these  limits  are  slightly  wider  than  the  corresponding  frequentist  limits,  but  this  extra 
wideness  is  likely  due  to  the  added  uncertainty  of  the  estimated  smoothing  parameter. 
Rosen  and  Cohen  (1995a)  also  obtained  calibration  intervals  for  these  data.  Their 
approach  involved  bootstrapping  a  cross- validated  cubic  smoothing  spline,  but  did  not 
account  for  bias  in  /2(x).  For  a  brief  discussion  on  bias  correction  in  nonparametric 
regression  using  the  bootstrap,  see  Davison  and  Hinkley  (1997,  pp.  362-366). 


Figure  4.7:  Nonparametric  calibration  for  ELISA  data.  Left:  Bayesian  P-spline  with 
posterior  and  95%  credible  interval  for  xq.  The  shaded  blue  region  represents  the 
(pointwise)  prediction  band  and  the  density  curve  represents  the  posterior  of  x0. 
Right :  Mixed-effects  model  P-spline  with  bias-adjusted  95%  calibration  interval  for 
xq.  The  shaded  blue  region  represents  the  (pointwise)  bias-adjusted  predction  band. 
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Table  4.3:  Point  and  interval  estimates  for  Xq  for  the  ELISA  data  with  y0  =  20. 


Method 

Estimate 

95%  interval 

Parametric  (homoscedastic  errors) 

9.184 

(7.356,11.657) 

P-spline  (bias-adjusted) 

9.345 

(7.462,11.711) 

P-spline  (Bayesian) 

9.285 

(7.250,11.776) 

4.4  Discussion 

We  have  proposed  a  new  method  for  calibration  in  a  semiparametric  setting  based 
on  the  mixed  model  approach  to  smoothing  that  corrects  for  bias  in  the  smoothed 
calibration  curve.  While  the  idea  of  using  LMMs  for  penalized  smoothing  is  not 
new  (see,  for  example,  Demidenko  (2013,  pp.  13  -  17)),  this  approach  has  never 
been  adapted  for  the  calibration  problem  as  we  have  proposed  here.  By  using  a 
simple  Monte  Carlo  experiment,  we  have  demonstrated  that  bias-adjusted  prediction 
intervals  can  be  used  to  (numerically)  obtain  calibration  intervals  for  the  unknown  Xo 
that  have  good  coverage  probabilities.  We  have  discussed  a  situation  where  this  bias- 
correction  is  more  serious  (i.e.,  regulation,  or  rather,  calibration  where  the  unknown 
xo  corresponds  to  some  specified  mean  response  /io).  Finally,  we  developed  a  Bayesian 
framework  for  semiparametric  calibration  by  extending  the  approach  originally  put 
forth  by  Hoadley  (1970)  for  linear  calibration  (see  Section  3.4).  The  frequentist 
framework,  while  fast  and  simple,  has  the  same  disadvantage  inherent  in  LMM 
inference;  that  is,  the  variance  of  the  estimated  smoothing  parameter  (which  is  a 
function  of  the  estimated  variance  components)  is  ignored.  The  Bayesian  approach 
handles  this  by  incorporating  prior  information  for  all  unknown  parameters  in  the 
model,  leading  to  slightly  wider  calibration  intervals.  The  Bayesian  approach  we 
presented,  however,  is  slower  and  more  difficult  to  implement  in  practice. 
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4-4-1  Priors. 

Although  we  used  a  uniform  prior  for  Xq  in  our  examples,  such  a  prior  for  x  is 
unlikely  to  be  useful  in  general  and  we  recommend  a  more  careful  elicitation  of  prior 
information.  A  sensitivity  analysis  should  also  be  carried  out  to  see  if  the  results  are 
sensitive  to  the  choice  of  prior  for  Xq.  Although  inverse  gamma  priors  are  commonly 
used  in  practice  as  noniuformative  priors  for  scale  parameters  in  hierarchical  models 
(e.g.,  the  variance  components  in  a  LMM),  Gelman  (2006)  argued  against  their  use. 
Instead,  Gelman  adovcated  the  use  of  conditionally  conjugate  priors  from  a  new 
folded-noncentral-f  family. 

4-4-2  Future  work. 

The  methods  proposed  in  this  chapter  open  the  door  to  a  number  of  future 
research  opportunities.  Perhaps,  the  most  interesting  (and  logical)  next  step  would 
be  the  inclusion  of  constraints.  For  example,  we  have  assumed  that  the  calibration 
curve  is  monotonic  over  the  range  of  interest.  Fortunately,  this  is  not  usually  a  concern 
when  the  data  are  collected  from  a  carefully  designed  experiment.  The  semiparametric 
fit,  however,  is  not  necessarily  monotonic  for  every  value  of  the  smoothing  parameter 
which  may  cause  problems  when,  say,  obtaining  a  bias-adjusted  calibration  interval. 
It  is  possible,  however,  to  incorporate  constraints,  such  as  monotonicity,  using  the 
general  projection  method  described  in  Mammen  et  al.  (2001).  Until  such  a  constraint 
can  be  smoothly  incorporated  (no  pun  intended)  into  our  semiparametric  approach 
to  calibration,  we  can  instead  rely  on  inference  regarding  the  first  derivative  of  /  as 
described  in  Ruppert  and  Wand  (2003,  pp.  151-156).  For  instance,  if  we  assume 
that  /  is  monotonically  increasing  (decreasing)  over  the  interval  [a,  b],  then  a  plot  of 
the  estimated  first  derivative  of  the  regression  function  should  lie  completely  above 
(below)  the  x-axis.  This  derivative  function  can  be  estimated  in  exactly  the  same 
way  as  the  regression  function  itself,  that  is,  using  P-splines.  Therefore,  a  thorough 
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calibration  analysis  might  include  a  plot  of  the  data  with  fitted  mean  response, 
supplemented  by  a  plot  of  the  estimated  first  derivative  function  (possibly  with  a 
confidence  band).  For  instance,  we  might  supplement  our  analysis  of  the  ELISA  data 
with  the  following  graphic: 


Figure  4.8:  Estimate  of  the  first  derivative  of  the  regression  function  for  the  ELISA 
example  with  a  95%  global  confidence  band.  A  horizontal  reference  line  is  displayed 
at  zero  on  the  y- axis.  This  plot  was  produced  using  the  R  package  SemiPar. 
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V.  Calibration  with  Grouped  Data 


In  this  chapter,  we  extend  the  application  of  calibration  to  grouped  data ;  that 
is,  data  in  which  the  observations  are  grouped  into  disjoint  classes  called  clusters 
or  groups.  Common  examples  of  grouped  data  include  repeated  measures  data  and 
longitudinal  data.  Groups  tend  to  be  homogeneous,  therefore,  observations  belonging 
to  the  same  group  cannot  be  considered  independent.  (Although,  observations 
between  clusters  usually  are.)  Thus,  we  need  to  account  for  within  cluster  dependence 
when  modeling  this  type  of  data.  To  our  knowledge,  other  than  Oman  (1998),  very 
little  has  been  done  for  calibration  with  grouped  data.  Oman  considered  a  simpler 
model  that  only  allowed  for  the  intercept  and  slope  to  vary  between  groups,  whereas 
we  take  a  more  general  (and  practical)  approach  that  allows  for  an  arbitrary  random 
effects  structure.  Furthermore,  while  Oman  considers  only  one  type  of  calibration 
interval,  we  discuss  four  different  calibration  intervals  that  can  be  computed  for 
grouped  data,  along  with  some  adjustments  to  improve  their  accuracy.  Moreover,  the 
calibration  interval  considered  by  Oman  was  based  on  an  approximate  parametric 
bootstrap  that  did  not  account  for  the  variance  attributed  by  the  random  variable 

3V 

The  LMM  was  introduced  in  Section  2.5.  In  Section  5.1,  we  discuss  a  particular 
useful  LMM:  the  random  coefficient  model.  In  Section  5.2,  we  propose  a  simple 
method  for  estimating  the  unknown  Xo  in  a  mixed  model  setting.  We  argue  the 
utility  of  this  approach  by  showing  that,  in  a  particular  case,  it  coincides  with 
the  ML  solution.  In  Sections  5.3  and  5.4,  we  discuss  construction  of  Wald  and 
asymptotic  inversion  intervals,  respectively.  We  then  propose  a  fully  parametric 
bootstrap  approach  for  controlled  calibration  in  Section  5.5.  Unlike  Oman  (1998), 
our  parametric  bootstrap  algorithm  does  take  into  account  the  variability  attributed 
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by  3^o-  Distribution-free  calibration  intervals  are  (briefly)  considered  in  Section  5.6. 
Finally,  Section  5.8  applies  the  aforementioned  techniques  to  a  real  dataset  taken 
from  Brown  (1993). 

5.1  LMMs  for  repeated  measures  data 

As  discussed  in  Section  2.5,  the  LMM  extends  the  basic  LM  (2.2)  to 

(5.1)  y  =  Xf3  +  Zol  +  e, 

where  X  and  Z  are  known  design  matrices,  (3  is  a  vector  of  fixed  effects,  a  is  a  vector 
of  random  effects,  and  e  is  a  vector  of  random  errors.  Since  we  are  using  mixed  models 
to  analyze  grouped  data,  it  would  behoove  us  to  decompress  model  (5.1)  into  the  form 
introduced  by  Laird  and  Ware  (1982), 

(5.2)  y,t  =  Xi/3  +  ZiOLi  +  €j,  i  =  1, . . . ,  m, 
where,  for  the  i-th  group: 

•  y,  is  an  n%  x  1  vector  of  response  variables; 

•  Xi  and  Z%  are  known  design  matrices  of  dimensions  nl  x  p  and  nl  x  q, 
respectively; 

•  {3  is  a  p  x  1  vector  of  fixed  effects; 

•  cx.i  is  a  q  x  1  vector  of  random  effects; 

•  Si  is  an  nt  x  1  vector  of  random  errors. 

Furthermore,  it  is  assumed  that  the  random  variables  { e*?; } {  and  are 

mutually  independent  and  distributed  according  to 

(Xi  ~  W  (0,  G) ,  (i  ~  Af  (0,  a2eRi)  . 
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For  our  purposes,  we  shall  assume  that  Rt  =  I  (an  N  x  N  identity  matrix);  that  is, 
we  are  assuming  constant  variance  within  groups.  Also,  for  computational  purposes 
(e.g.,  maximizing  the  likelihood),  it  is  convenient  to  reparameterize  G  as  oy  Gb  where 
Gt  is  the  “scaled”  variance-covariance  matrix  for  the  random  effects.  In  other  words, 
we  have  that 

(Xi  ~  A7  (0,oy2GT)  ,  ei  ~  A7(0,  oyl)  . 

Model  (5.1)  can  also  be  written  in  marginal  form  as 


y*~  Mix&Vi), 

where 

Vi  =  ZGZ'  +  a2eI. 


In  long  notation  (Equation  (5.1)),  we  have 


y  = 

yi 

,  X  = 

X1 

,  z  = 

i 

o  o 

° 

Cs  O 

i 

Nxl 

v 

Nxp 

0  0  Zm 

«  = 


Oil 


e  = 


ei 


Oim 


mqX  1 


Nxl 


where  N  =  X!"=i  ni  t°fal  number  of  observations.  Similarly,  this  can  be 

summarized  in  marginal  form  as  y  ~  A f  (X/3,  V)  where 


V  =  a\\  J„,  +  Z.G'Z'  , 

^diag  J  i= 1 

Notice,  in  this  model,  how  the  fixed  effects  are  used  to  model  the  mean  of  y  while 
the  random  effects  govern  the  variance-covariance  structure  of  y .  In  fact,  as  pointed 
out  by  McCulloch  et  al.  (2008),  a  key  reason  for  including  random  effects  in  a  model 
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is  to  simplify  the  otherwise  difficult  task  of  dealing  with  N(N  +  l)/2  unique  elements 
of  V 

Ignoring  constants,  the  log-likelihood  for  the  data  can  be  written  as 
JV  1  m 

(5.3)  2  (0,  y,  0)  =  -- log(y)  -  -  53  log  |  J  +  Z0Z[\ 

i= 1 
m 

-nE*- x'&  i1  +  Z<G,Z.T‘  O'.  -  x>0) . 

e  i= 1 

where  9  is  a  vector  containing  the  unique  elements  of  Gh  Since  G^  is  symmetric,  it  has 
at  most  g(g  +  l)/2  unique  elements;  hence,  6  has  a  maximum  dimension  of  q(q  + 1) / 2. 
In  many  practical  applications,  however,  we  can  restrict  G^  (or  equivalently  G)  to 
simpler  forms  involving  only  a  few  parameters.  For  example,  G*  may  be  a  constant 
multiple  of  the  identity  matrix,  r  J,  which  corresponds  to  uncorrelated  random  effects 
with  constant  variance  ofr;  in  this  case,  9  =  r  has  dimension  one.  This  is  the 
variance-covariance  structure  we  used  for  the  random  coefficients  in  the  mixed  model 
approach  to  P-splines  in  the  previous  chapter. 

One  of  the  most  useful  LMMs  for  repeated  measures  data  is  the  so-called  random 
linear  trend  model : 

(5.4)  ytj  =  (p0  +  a0i)  +  (fix  +  an)  xtj  +  i  =  j  =  1, . . . ,  nh 

where  f30  and  f3\  are  fixed  effects,  {a0i}  are  random  intercepts  distributed  as  A f  (0,  of), 
{ay j}  are  random  slopes  distributed  as  A/”  (0,  crj) ,  and  {%}  are  i.i.d.  random  errors 
distributed  as  A/"(0,  of).  Also,  if  we  let  Cov  {aoy  ay;}  =  chn,  then 

ffi)  On 
On 

We  often  assume  the  random  intercepts  and  slopes  are  independent,  that  is,  of  =  0; 
this  can  be  formally  tested  using  a  likelihood  ratio  test.  Also,  setting  a0i  =  •  •  •  = 
«om.  =  0  or  an  =  •  •  •  =  aym  =  0  yields  the  random  intercept  and  random  slope  models, 
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respectively  (see  Figure  5.1).  Considerable  simplifications  arise  for  the  balanced  cases 
(i.e. ,  when  Ui  =  n  for  all  i).  For  example,  for  a  balanced  random  intercept  model, 
the  matrix  Z  of  Equation  (5.1)  is  just  Z  =  Im  <g)  ln,  where  the  symbol  <g)  denotes 
the  Kronecker  product.  Estimating  the  fixed  effects  and  variance  components  for 
model  (5.4)  is  also  much  simpler  in  the  balanced  case. 


Figure  5.1:  Common  random  coefficient  models  for  grouped  data.  Each  plot  consists 
of  ten  measurements  on  15  subjects — each  of  which  has  a  positive,  linear  trend.  Left: 
Random  intercepts.  Middle:  Random  slopes.  Right:  Random  intercepts  and  slopes. 

5.1.1  Prediction  of  future  observations. 

In  the  literature  for  mixed  models,  prediction  of  future  observations  is  often 
overshadowed  by  the  prediction  of  random  effects.  Nonetheless,  prediction  of  future 
observations  in  mixed  models  is  an  important  topic — see  Jiang  (2007)  for  some 
motivating  examples.  For  LMMs,  there  are  two  kinds  of  predictions  we  can  make 
regarding  a  future  observations:  (1)  predicting  a  new  observation  within  an  existing 
group,  and  (2)  predicting  a  new  observation  in  a  new  group.  Since  longitudinal 
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studies  often  aim  to  make  inference  for  the  whole  population  under  study — and  not 
just  the  groups  sampled — we  will  restrict  our  attention  to  case  (2).  In  particular,  for 
calibration,  we  will  assume  that  a  (single)  new  observation,  denoted  320,  is  independent 
of  the  current  observations  and  does  not  belong  to  an  existing  group. 


5.2  Point  estimation 

In  this  section,  we  discuss  point  estimation  of  xq  in  a  mixed  model  setting. 
Generally,  it  is  difficult  to  compute  the  ML  estimate  of  Xq  due  to  the  complex  nature 
of  mixed  model  likelihoods;  however,  as  shown  below,  some  cases  yield  relatively 
simple  results. 

Consider  the  linear  random  trend  model  (Equation  (5.4)).  For  a  particular  x,  we 

have 


RANDOM 


RANDOM 


y  =  ft  +  fax  +  o0  +  oliX  +  e  =  /j,  (x;  (3)  +  R  (x;  a), 


FIXED  FIXED 

where  /i  ( x ;  /3)  —  E  is  the  (population)  mean  response.  For  simplicity  of 

notation,  let  fi  (x)  =  n(x]/3).  Solving  the  equation  n(x)  =  ft  +  f5\x  for  x,  we 
get  (assuming  /3\  ^  0) 

Kx)  ~  A 

A  ' 

If  jko  denotes  a  random  observation  from  a  normal  distribution  with  mean  h(xq),  then 
E{3^o}  =  fi{x0)  =  A)  +  PiXq,  therefore,  a  natural  estimator  of  x0  is 


(5.5) 


~  3^0  - 

X0  =  - ^ - 

ft 


where  /3  =  ^ft,  ft  j  is  the  EBLUE  of  (3  —  (/30,ft)/.  Notice  how  this  has  the  same 
form  as  the  classical  estimator  (Equation  (3.3))  discussed  in  Section  3.2.  In  fact,  for 
the  balanced  random  intercept  model,  the  classical  estimator  is  still  the  ML  estimator 
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of  x0;  in  other  words,  we  can  compute  the  ML  estimator  of  x0  using  ordinary  least 
squares  with  i.i.d.  normal  errors!  To  see  this,  note  that  for  the  general  case  G 1  =  rl 
and  Zi  =  1  i  (a  column  vector  of  all  ones),  hence,  Vi  =  cr2  (/,,  +  rljl').  Therefore,  we 
can  write 

y,,  ~  JV  {Xij3,  cr2  (. It  +  rljl'i)}  )  i  —  1, . . .  ,m, 

where  X,  is  an  rii  x  2  design  matrix  with  j-th  row  equal  to  X L  =  (1,  x^),  / 3  =  (/30,  Z^)7 
is  a  vector  of  hxed  effects,  cr2  is  the  within-subject  variance,  and  <t2t  is  the  variance  of 
the  random  intercepts.  From  Equation  (5.3),  the  log- likelihood  for  the  data  (ignoring 
constants)  is 

N  1  m 

■&i  (P,  T)  =  -y  log  (a2)  -  ^  ^  !■ 1  +  rl*1il 

2=1 

1  m 

-  2^2  E  (^  -  X^)'  • 

e  2=1 

The  subscript  “I”  is  there  to  remind  us  that  this  is  the  likelihood  for  the  standards, 
the  data  from  the  Erst  stage  of  the  calibration  experiment  (see  Section  1).  Using  the 
following  formulas  (Demidenko,  2013,  pg.  49), 


1 1  +  rljl'l  =  1  +  riiT ; 


I  +  rwr*  =  *- 


1+TliT  ' 


the  log-likelihood  simplihes  to 

N  1  m 

{Pi  aei r)  =  ~ y  log  {^e)  ~  g  lo§  I1  +  n*r) 

2=1 


1  m  / 

^  (y,  -  xt/3)'  /  -  ^r^iii'i )  (; y»  -  x/3) 


2a2 
C  2=1 


1  +  77-jT 

Similarly,  the  log- likelihood  for  the  (single)  unknown  (i.e. ,  the  log- likelihood  for  the 
data  from  the  second  stage  of  the  calibration  experiment)  is 

-2n  {Pi  T,  xo)  =  ~  log  (a2)  -  ^  log  (1  +  r)  -  ^  J  +  ^  (y0  -  Po  -  Pix0)2  ■ 
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From  the  independence  of  y  and  33),  the  log-likelihood  for  the  pooled  data,  denoted 
&  (P,  of,  r,  x0),  is  given  by 


&  (P,  of ,  T,  Xq)  =  jSfi  (p,  CT2,  r)  +  j£fn  (P,  of,  T,  Xq) 


/\r  _i_  I  i  m  i 

— z—  log  (a2)  -  -  log  (1  +  tut)  -  -  log  (1  +  r) 


i=  1 


i 


2 

6  i=  1 


T 


(33> —  A)  -  A^o) 


1  +  riiT ' 
2 


iii;  o>i  -  ^/3) 


2  of  (l  +  r) 

Thus,  the  full  log-likelihood  is  the  sum  of  two  parts:  the  log-likelihood  for  the 
standards,  and  the  log-likelihood  for  the  unknown.  Equating  to  zero  the  partial 
derivative  of  the  full  log-likelihood  with  respect  to  the  parameter  xo  results  in 
Xq  (P)  =  (33>  —  Po)  /Pi-  In  other  words,  for  any  value  of  f3,  x0  ((3)  maximizes  the 
likelihood  with  respect  to  Xq.  Plugging  this  back  into  the  log-likelihood  yields  the 
profiled  log-likelihood 

AT  1  1  171  1 

(P,  a2,  r)  = - —  log  (a2)  -  -  ^  log  (1  +  mr)  -  -  log  (1  +  r) 


i—  1 


2ae2  .  1 
c  2—1 


E  (tt  -  x^y  ( 1 


T 


i  d'Aiyi-XiP), 


1  +  UiT  ' 

Similarly,  equating  the  partial  derivative  of  ££p  (P ,  a2 ,  r) ,  with  respect  to  the 
parameter  of,  to  zero  yields 


(P,  t)  = 


N  +  l 


Y,(yi-x,P)  (i 


2=1 


1  +  UiT  ' 


hi'i  )(yi-xif3y. 


Substituting  this  back  into  the  prohled  log-likelihood  and  simplifying  (i.e.,  cancelling 
common  factors  and  ignoring  constants)  we  get 
N  +  l 

iog<j 

2=1 


■%,  (A  r)  = 


iog  5]  (y*  -  x&)'  ( I 


T 


1  +  UiT 


iii;  o>i  -  XiP) 


-  \  it, iog  ^ + niT ) ~  \ iog  + r)  > 


2=1 
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The  parameters  Xo  and  of  have  been  “profiled  out”,  resulting  in  a  simpler  log- 
likelihood  in  only  p  +  1  parameters.  We  could  continue  in  this  fashion  with  the 
parameter  (3  as  well,  although  it  is  quite  easy  to  see  that  the  value  of  /3  that  maximizes 
S£v  (/3,  r)  is  just  the  usual  GLS  estimator,  /3,  given  by  Equation  (2.19).  Furthermore, 
it  can  be  shown  (Demidenko,  2013)  that,  for  the  balanced  case,  f3  does  not  depend 
on  r  and  in  fact  reduces  to  the  ordinary  LS  estimator  /3  =  (X'X)^1  X'y.  Thus,  the 
ML  estimator  of  Xo  for  the  balanced  random  intercept  model  is  simply 

x0  =  x0  (  (3\  =  — £ - , 

v  7  Pi 


where 


A  = 


EILi  Eg*  (Xij  -  x)  (Vij  -  y) 

(r  -  —  f)2 


A)  =  y  -  A®- 


In  general,  the  ML  estimator  x0  of  Xo  is  difficult  to  obtain  analytically.  For 
example,  in  a  random  slope  model,  jho  ~  A/"  (/do  +  AlXo,  +  cr^ ),  thus,  the  unknown 

Xo  affects  both  the  mean  and  variance  of  3^o-  A  reasonable,  and  more  practical 
approach,  is  to  proceed  as  before,  that  is,  by  solving  the  equation  y0  =  /x(x0)  for  the 
unknown  xq.  As  shown  above,  for  the  balanced  random  intercept  model,  this  leads 
to  the  ML  estimate. 

We  should  point  out  that,  in  general,  the  EBLUE  of  / 3  (Equation  (2.20))  depends 
on  the  estimated  variance  components  through  V.  Furthermore,  recall  that  for  the 
linear  calibration  problem,  the  bias-adjusted  ML  estimator  of  of  (the  only  variance 
component)  is 


1  n  2  1  n-\-m 

(5.6)  ne2  =  — -  J2  ( 3>i  -  A)  -  Axi)  +  — -  (tt  -  3^o)2  • 

2=1  z=n+l 

The  hrst  term  in  Equation  (5.6)  is  the  usual  unbiased  estimator  of  of,  and  the  second 
term  is  just  the  sample  variance  of  the  m  unknowns  Toi  i  •  •  • ,  Tom-  If  there  is  only 
one  unknown,  then  the  second  term  in  Equation  (5.6)  is  zero!  Hence,  for  a  single 
unknown,  the  ML  estimator  of  the  variance  of  (adjusted  for  bias)  is  unaffected.  We 
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could  extrapolate  by  making  a  similar  argument  for  calibration  in  mixed  models. 
That  is,  if  only  a  single  observation,  34),  is  available  from  the  second  stage  of  the 
calibration  experiment,  then  the  ML  estimates  of  the  variance  components  (adjusted 
for  bias)  should  be  unaffected — this  would  not  be  the  case  for  replicate  unknowns 
sharing  a  single  unknown  xq.  In  this  chapter,  we  only  concern  ourselves  with  the  case 
of  a  single  unknown;  thus,  we  make  the  argument  that  the  usual  estimates  of  the 
variance  components  are  valid  for  calibration  inference. 

For  illustration,  we  generated  n  =  30  observations  for  each  of  m  =  15  groups 
from  a  random  intercept  model  with  fixed  effects  (3  =  (0,  l)7  and  variance  components 
(Tq  =  0.01  and  of  =  0.001.  A  spaghetti  plot  of  the  data  is  shown  in  Figure  5.2;  notice 
the  apparent  linear  trend  with  varying  intercepts.  For  a  new  observation,  y0  =  0.75, 
the  estimate  of  the  corresponding  unknown  Xo  is  Xo  — 

Since  these  data  are  balanced,  xq  is  also  the  ML  estimate 


(o.75  -  A))  /3i  =  0.7819. 

of  Xq. 
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Figure  5.2:  Simulated  random  intercept  data. 


5.3  Wald  interval 

The  delta  method  provides  the  simplest  approach  to  computing  confidence 
intervals,  as  long  as  the  quantity  of  interest  is  a  function  of  random  variables  that 
are  at  least  asymptotically  normal.  Under  mild  regularity  conditions  often  satisfied 
in  practice,  the  ML  estimator  (3  of  (3  is  consistent  and  asymptotically  normal  with 
mean  vector  (3  and  asymptotic  variance-covariance  matrix  (X'V~1X)  1  (Pinheiro, 
1994).  Furthermore,  by  assumption,  y0  is  distributed  as  J\f  {(3q  +  (3\Xo,ol),  where  of 
denotes  the  variance  of  3^o  which,  depending  on  the  random  effects  structure  of  the 
model,  may  involve  the  unknown  Xq.  For  instance,  if  R  (x;  ck)  =  a  +  e  (i.e.,  a  random 
intercept  model),  then  of  =  of  +  of,  whereas  if  R(x]ct )  =  ax  +  e  (i.e.,  a  random 
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slope  model),  then  erg  =  Xq cF)  +  of  which  depends  on  x0.  Although  our  attention  is 
restricted  to  LMMs  in  this  dissertation,  the  simplicity  of  the  Wald-based  approach 
extends  to  nonlinear  mixed-effects  models  (NLMMs)  as  well.  For  an  introduction  to 
NLMMs,  see  Pinheiro  (1994)  and  Pinheiro  and  Bates  (2009). 

The  Wald  statistic,  denoted  W,  is  essentially  a  generalization  of  the  usual  Z- 
statistic.  Here  it  is  the  point  estimator,  normalized  by  an  estimate  of  its  standard 
error  which  we  obtain  using  Taylor’s  theorem.  By  assumption,  for  large  enough  N, 
W  =  To/Var  {xo}  has  a  x2(l)  distribution.  As  pointed  out  by  Harrell  (2001,  p.  184), 
most  statistical  packages  treat  \/VV  as  having  a  ^-distribution  with  degrees  of  freedom 
df  (rather  than  a  standard  normal  distribution),  however,  there  is  usually  no  basis 
for  this  outside  of  the  ordinary  linear  model  (Gould,  1993). 

We  discussed  the  delta  method  for  calibration  in  Section  3.3.2,  where  we  used  a 
first-order  Taylor  series  expansion  to  approximate  the  variance  of  xq.  The  same  basic 
procedure  holds  here  for  LMMs  except  now  Var  {Vo}  contains  additional  variability 
attributed  by  the  random  effects.  As  a  simple  example,  we  consider  the  balanced 
random  intercept  model.  For  this  case,  we  have  shown  that  the  ML  estimator  of  xq 
is  given  by  x0  =  ^Vo  —  Poj  /Pi-  We  should  make  clear,  however,  that  even  though 
(3  does  not  depend  on  the  variance  components,  its  variance-covariance  matrix  does! 
Therefore,  we  still  need  to  compute  the  variance-covariance  matrix  of  / 3  from  a  mixed 
model;  for  example,  the  variance-covariance  matrix  obtained  from  the  SAS/STAT  (SAS 
Institute  Inc.,  2011)  software’s  PR0C  MIXED  procedure  or  the  lme  function  from  the  R 
package  nlme  (Pinheiro  et  ah,  2013).  The  variance-covariance  matrix  of  ( Vo>/3)  is 


(5.7) 


E 


Var  {Vo} 

0 

°o  0 

0 

Var  |  (3 1 

o  (x'v^xy1 

Since  Vo  is  independent  of  V ,  it  is  also  independent  of  (3.  Recall  that  our  point 
estimate  has  the  form  x  =  /r_1  ( y\(3 ).  Let  (j/;/3)  and  the  vector- valued  function 
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H2  1  (y;  (3)  be  the  partial  derivatives  of  /i-1  with  respect  to  the  parameters  y  and 
/3,  respectively.  Our  point  estimator  is  given  by  /i^1  (3^j ,  where  3^o  is  a  new 

observation  and  (3  is  the  EBLUE  of  (3.  Using  a  first-order  Taylor-series  expansion, 
an  approximate  variance  for  xq  is  given  by 


(5.8)  Var{x0}=  /V  (Vo;  3) 


ao  + 


/u1  (wj)  (X'r1!)  1  ^ 


Of  course,  to  obtain  Var{x0},  we  need  to  replace  dg  and  V  in  Equation  (5.8)  with 

their  corresponding  estimates  <7q  and  V,  respectively.  Once  we  have  Var{x0},  an 

approximate  100(1  —  a)%  conhdence  interval  for  xq  is  given  by  xo±£i-a/2\/var  {i?o}- 

As  pointed  out  by  Davidian  and  Giltinan  (1995,  p.  283),  the  Erst  term  in 

Equation  (5.8)  usually  (but  not  necessarily)  dominates  the  second  term,  leaving 

2 

erg.  For  example,  consider  the 
random  intercept  model  discussed  earlier.  For  this  model,  we  have 


the  cruder  approximation  Var{x0}  =  y1 1  uy0;  (3j 


y  1(y,(3 )  =  (y-  Po)/Pu 

1  (y,P)  =  i /Pu 

thus,  for  the  balanced  case,  the  crude  approximation  turns  out  to  be  Varcmde  {x0}  = 
of  (1  +  r)  / f3\.  In  general,  the  approximate  variance  (Equation  (5.8))  can  be  difficult 
to  compute  by  hand  leading  to  widespread  use  of  the  cruder  formula.  However,  given 
the  ease  with  which  complex  computations  can  be  carried  out  numerically  (and  with 
great  accuracy),  it  is  good  practice  to  compute  and  use  both  terms.  In  Section  5.8, 
we  illustrate  with  a  real  data  example  how  easily  this  can  be  done  using  the  software 
R  (in  particular,  see  Example  5.8.1). 

For  the  simulated  data  in  Figure  5.2,  we  get  an  approximate  standard  error  (i.e. , 
using  Taylor’s  theorem)  of  0.1  which  produces  a  Wald-based  calibration  interval  for 
xq  of  (0.5859,0.9778).  Compare  this  to  the  cruder  interval  (0.5915,0.9722)  obtained 
by  using  Varcrude  {iro}  in  place  of  the  full  Taylor-series  estimate  (5.8). 
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Unfortunately,  one  of  the  difficulties  with  inference  in  LMMs  is  that 


Var 


w 


=  [X'V^X 


involves  V,  but  does  not  take  into  account  its  variability.  In  other  words,  Var 


underestimates  the  true  variance  of  (3  (see,  for  example,  McCulloch  et  al.  (2008,  pp. 
165-167)).  Thus,  any  inference  relying  on  Var  j/3j,  including  the  Wald  interval  for 
calibration  just  discussed,  may  be  misleading.  An  alternative  is  to  use  the  so-called 
parametric  bootstrap  to  compute  a  bootstrap  estimate  of  the  variance-covariance 
matrix  (3  to  use  in  place  of  Var  j/3  j  in  Equation  (5.7)  (see  steps  (l)-(4)  of  Algorithm  2 
on  page  85). 


5.4  Inversion  interval 

We  can  also  obtain  a  confidence  interval  for  the  unknown  x0  by  inverting  an 
asymptotic  prediction  interval  for  Vo-  Let  Xo  have  the  same  form  as  the  i-th  row  of 
X,  but  with  Xij  replaced  with  x0-  For  example,  if  /i  (ipy.  (3)  =  /30  +  f3\Xij  +  then 
X0  =  (l,x0,xl)'. 

For  brevity,  let  po  =  p{xq\(3),  po  =  /i  (xo;/3^J,  and  po  —  p  (xq]  {3^J  where,  as 
before,  (3  and  (3  denote  the  BLUE  and  EBLUE  of  f3,  respectively.  A  new  observation, 
Vo  say,  with  unknown  xo,  is  distributed  as  A7(/i0,  ctq).  Clearly,  Vo  —  is  a  normally 
distributed  random  variable  with  expectation  zero  (since  E{Vo}  =  E  {/70}  =  p0). 
Also,  note  that  Vo  and  Po  are  independent,  hence,  Cov{Vo,/L)}  =  0.  It  therefore 
follows  that  the  statistic 


/ g  g  v  ^  =  _ 3^0  ~  P0 _  _  _ 3^0  ~  PO _ 

-/Var  {Vo}  +  Var  {p0}  ^ a2  +  X'Q  (X'V~lX')~l  XQ 

is  a  pivotal  quantity  that  has  an  asymptotic  standard  normal  distribution  or, 
equivalently,  Z 2  ~  \i  (a  Chi-squared  distribution  with  a  single  degree  of  freedom). 
The  key  here  is  that  po  is  a  linear  function  of  V-  For  example,  for  a  balanced  random 
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intercept  model,  Equation  (5.9)  reduces  to 

^  _ _ yo  —  A)  —  Pixq _ 

\J +  <rl  +  X'Q  [ X '  ®  Jn  +  a/INyl  X] _1  X0  ’ 

where  Jn  =  lnl^  is  an  n  x  n  vector  of  all  ones. 

In  practice,  <Tq,  V,  and  x0  are  usually  unknown  and  need  to  be  estimated  from 
the  data.  In  such  cases,  an  approximate  pivot  (essentially  a  Wald  statistic),  denoted 
Q,  can  be  obtained  by  replacing  erg,  V,  and  x0  in  the  above  equation  with  their 
respective  estimates  Og,  V,  and  xq.  This  suggests  an  approximate  100(1  —  a)% 
confidence  interval  for  Xq  of 

(5.10)  JCai(x)  =  {a;  :  za/2  <  Q  <  ^i_a/2}  , 

where  za/2  =  ^i-a/2  denote  the  a/2  and  1  —  a/2  quantiles  of  a  standard 
normal  distribution,  respectively.  Similar  to  the  approximate  predictive  pivot 
(Equation  (3.14)),  it  is  unlikely  that  Equation  (5.10)  will  yield  closed-form  solutions, 
thus,  the  solution  must  be  obtained  numerically.  For  the  simulated  data  example,  a 
95%  inversion  interval  based  on  Equation  (5.10),  corresponding  to  yo  =  0.75,  is  given 
by  (0.5859, 0.9779),  which  is  very  similar  to  the  Wald-based  intervals  obtained  earlier. 

5.5  Parametric  bootstrap 

In  Section  3.3.3,  we  discussed  how  to  calculate  calibration  intervals  based  on 
the  nonparametric  bootstrap.  A  crucial  assumption  for  the  ordinary  nonparametric 
bootstrap,  however,  is  that  the  data  are  independent;  for  reasons  discussed  earlier, 
this  assumption  is  typically  not  valid  for  grouped  data.  Nonetheless,  a  different 
kind  of  bootstrap,  called  the  parametric  bootstrap,  has  shown  promise  as  a  serious 
inferential  tool.  For  examples,  see  McCulloch  et  ah  (2008,  pg.  342)  and  Efron 
(2011).  When  applicable,  the  parametric  bootstrap  typically  gives  answers  similar 
to  a  Bayesian  analysis  with  uninformative  priors,  however,  it  is  much  faster  than 
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MCMC  simulations  (bootstrap  simulations  usually  only  require  a  few  thousand 
iterations  whereas  traditional  MCMC  simulations  may  require  tens,  or  even  hundreds 
of  thousands,  of  iterations).  The  parametric  bootstrap  essentially  entails  sampling 
from  the  fitted  model  itself,  rather  than  sampling  (with  replacement)  from  the 
data.  For  controlled  calibration  in  a  mixed  model  setting,  we  propose  the  following 
algorithm  (essentially  a  parametric  version  of  Algorithm  1  based  on  the  LMM  instead 
of  the  ordinary  LM): 

Algorithm  2:  Parametric  bootstrap  for  controlled  calibration  in  LMMs. 

for  r  =  1  to  R  do 

(1)  generate  q  new  values  of  the  random  effects,  denoted  a*,  from  a  J\f  ^0,  G 
distribution; 

(2)  generate  N  new  errors,  denoted  e*,  from  a  A/"(0,  of/)  distribution; 

(3)  set  y *  =  X/3  +  Zol*  +  e*; 

(4)  update  the  original  model  using  y *  as  the  response  vector  to  obtain  /3*; 

(5)  generate  y*)r  from  aA(!/o,^)  distribution; 

(6)  compute  x^r  =  y~l  (y£r;/3*). 

end 

Note  that  only  steps  (5)  and  (6)  are  specific  to  calibration.  Similar  parametric 
bootstrap  schemes  have  also  been  proposed  for  mixed  models.  For  example,  we 
can  condition  on  the  current  values  of  the  random  effects  by  ignoring  step  (1)  of 
Algorithm  2  and  using  the  current  EBLUP,  a,  in  place  of  ot *.  Semiparametric  variants 
of  Algorithm  2  that  involve  sampling  directly  from  the  EBLUP  and  residuals  have 
also  been  proposed,  but  Morris  (2002)  considers  this  to  be  bad  practice  because  it 
consistently  underestimates  the  true  variation  in  the  data. 

We  applied  Algorithm  2  to  the  simulated  data  from  Figure  5.2.  A  histogram  of 
the  R  =  9,  999  bootstrap  replicates  of  xo  is  shown  in  Figure  5.3.  Not  surprisingly, 
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the  distribution  is  reasonably  symmetric  and  approximately  normal;  the  normal  Q-Q 
plot  also  confirms  this.  These  bootstrap  replicates  were  used  to  produce  the  last 
two  confidence  intervals  in  Table  5.1.  For  comparison,  we  have  also  included  the 
calibration  intervals  computed  in  the  previous  sections.  The  results  are  all  very 
similar  and  there  is  little  reason  here  for  choosing  one  interval  over  another.  The 
Wald-based  intervals  are  symmetric,  but,  as  can  be  seen  from  Figure  5.3,  symmetry 
is  not  unrealistic  for  this  example  (this  is  not  the  case  for  the  example  given  in 
the  next  section).  The  inversion  interval  is  not  symmetric  about  Xo,  as  well  as  the 
bootstrap  intervals,  however,  the  bootstrap  approach  has  the  advantage  of  providing 
an  estimate  of  the  entire  sampling  distribution  of  xq.  It  should  be  noted,  though,  that 
the  parametric  bootstrap  assumes  that  the  model  specified  for  the  data  is  correct!  If, 
however,  the  data  were  not  normal,  then  all  of  these  intervals  would  likely  produce 
misleading  results.  In  the  next  section,  we  discuss  a  potential  remedy  that  can  be  used 
for  non-Gaussian  LMMs,  that  is,  LMMs  that  do  not  assume  a  specific  distribution 
for  the  random  effects  or  the  errors. 

Table  5.1:  Approximate  95%  calibration  intervals  for  the  simulated  balanced  random 
intercept  example.  The  intervals  based  on  the  parametric  bootstrap  are  labeled  (PB). 


Interval 

Estimate 

Lower  2.5% 

Upper  97.5% 

Length 

SE 

Wald 

0.7819 

0.5859 

0.9778 

0.3920 

0.0999 

Crude  interval 

0.7819 

0.5915 

0.9722 

0.3807 

0.0971 

Inversion 

0.7819 

0.5859 

0.9779 

0.3920 

NA 

Normal  (PB) 

0.7819 

0.5873 

0.9742 

0.3870 

0.0987 

Percentile  (PB) 

0.7819 

0.5895 

0.9789 

0.3893 

0.0987 

Bootstrap  value 


Theoretical  quantile 


Figure  5.3:  Bootstrap  distribution  of  Xq  obtained  using  Algorithm  2.  The  dotted  red 
curve  represents  a  normal  distribution  with  mean  xq  and  standard  deviation  estimated 
from  the  bootstrap  replicates  x*Qr.  The  vertical  black  line  indicates  the  position  of  xQ. 


5.5.1  Parametric  bootstrap  adjusted  inversion  interval. 

Although  we  favor  the  bootstrap  confidence  intervals  obtained  directly  from  the 
R  bootstrap  replicates  of  xq,  researchers  are  likely  more  familiar  with  the  inversion 
and  Wald-based  intervals  discussed  in  the  previous  two  sections.  These  intervals, 
however,  use  the  quantiles  from  a  standard  normal  distribution  (i.e.,  rely  on  normal 
approximations).  The  parametric  bootstrap  can  be  used  to  improve  upon  these 
intervals  by  replacing  the  standard  normal  quantiles  with  more  accurate  ones.  For 
instance,  for  the  inversion  interval,  at  each  run  in  Algorithm  2,  we  compute 

Vo~  p  [x0;  P*) 

Q *  =  v  y 

Ja%*  +  X'0 
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where  the  denominator  is  evaluated  at  x0  =  x0.  As  a  result,  we  obtain  the  R  bootstrap 
values  Q*.  Let  and  r)\_a/2  denote  the  sample  a/2  and  1  —  a/2  quantiles  of  Q*, 
respectively.  A  bootstrap  adjusted  inversion  interval  for  xq  is  then  given  by 

(5-11)  «^cal (X)  =  (x  :  1a/2  <  Q  <  71-0/2}  • 

We  illustrate  this  on  the  bladder  volume  example  in  Section  5.8.  A  similar  adjustment 
can  also  be  made  for  the  Wald-based  interval  as  well;  this  is  very  similar  to  the 
studentized  bootstrap  procedure  outlined  in  steps  (6)-(7)  of  Algorithm  1. 

5.6  Distribution  free  calibration  interval 

For  certain  cases,  we  can  easily  obtain  an  asymptotic  prediction  interval  for 
a  future  observation  that  does  not  require  normality  for  the  random  effects  or 
errors.  These  intervals  are  called  distribution-free  prediction  intervals;  for  details, 
the  interested  reader  is  pointed  to  Jiang  and  Zhang  (2002)  or  Jiang  (2007).  This 
suggests  the  possibility  of  a  distribution-free  calibration  interval  for  xo  by  inverting  a 
corresponding  distribution-free  prediction  interval. 

We  consider  only  the  case  of  standard  LMMs.  Following  Jiang  and  Zhang  (2002) 
and  Jiang  (2007),  an  LMM  is  said  to  be  standard  if  each  Z,  of  Equation  (5.2)  consists 
of  only  0’s  and  l’s,  such  that  each  row  contains  exactly  one  1  and  each  column  has  at 
least  one  1.  The  random  intercept  model  (balanced  or  unbalanced  case)  is  standard 
in  this  sense;  the  random  slope  model,  however,  is  not. 

The  method  for  standard  LMMs  turns  out  to  be  quite  simple.  Compute 
the  ordinary  LS  estimate  (3  =  (X'Xy1  X'y,  then  obtain  the  residual  vector  as 
e  =  y-Xp.  Denote  the  a/2  and  1  —  a/2  quantiles  of  the  residuals  as  eQ/2  and  ei_a/2, 
respectively.  Then,  a  distribution- free  prediction  interval  for  a  new  observation,  yo, 
with  asymptotic  coverage  probability  1  —  a  is  given  by  [J2(x 0)  +  ea/2,  JI(x 0)  +  ei_Q/2] , 
where  Jl(x 0)  is  the  empirical  best  predictor  of  y q.  For  example,  for  a  random  intercept 


model,  the  interval  in  question  is  simply  \J30  +  /3iXq  +  ea/2,  /%  +  (3\Xo  +  e\ -a/2j  •  If  Vo 
is  observed  and  Xq  is  the  unknown,  then  this  formula  can  be  inverted  (in  closed-form) 
to  produce  an  asymptotic  100(1  —  a)%  distribution-free  confidence  interval  for  xq  of 

jy°  ~  ei— Q/2)  ~  Po  (: Vo  ~  e-a/ 2)  ~  A) 

L  A  ’  A 

Using  Equation  (5.12),  a  95%  distribution- free  calibration  interval  for  the  simulated 
random  intercept  example  is  (0.6121,0.9852).  ffowever,  since  these  data  are  normal 
(we  know  because  we  generated  the  data),  the  intervals  given  in  Table  5.1  are  probably 
more  accurate. 


5.7  Simulation  study 

To  illustrate  the  practical  performance  of  the  confidence  interval  procedures 
discussed  in  Sections  5. 3-5. 5,  we  ran  a  small  simulation  study  in  which  (:%,  3%), 
i  =  1,  2, . . . ,  30,  j  =  1,  2, . . . ,  20,  were  generated  from  both 

y%j  —  {oiQj  +  A))  +  ( Olij  +  (5\ )  Xij  +  €ij 

and 

y%j  —  («o j  +  P  0)  +  («i  j  +  (3 1)  +  f32x2ij  +  €ij. 

In  the  hrst  model,  (/90,  (i\)  =  (0,2),  a0j  ~  A/"  (0, 0.001),  a>\ j  ~  J\f  (0,0.05), 
and  6ij  ~  N  (0, 0.001).  Similarly,  in  the  second  model,  (/30,  /%,  62)  =  (0,3, —1), 
aoj  ~  JV  (0,  0.0001),  aij  ~  J\f  (0,0.05),  and  ~  (0,0.001).  I11  both  models,  the 

random  variables  {aoj},  {aij},  and  {eij}  are  mutually  independent.  We  computed 
separate  95%  calibration  intervals  using  each  method  corresponding  to  five  different 
unknowns:  (To  =  { 0,  0.5, 1, 1.5,  2}.  An  example  data  set  from  each  model  is  displayed 
in  Figure  5.4. 

Due  to  the  large  computational  time  involved  in  bootstrapping  and  fitting  mixed 
models,  the  Monte  Carlo  sample  size  was  limited  to  A^  =  1,000  and  the  bootstrap 


sample  size  was  set  to  R  —  999.  Furthermore,  because  we  used  a  confidence  level 
of  1  —  a  =  0.95,  the  standard  deviation  of  the  coverage  probability  is  approximately 
v/0.95  (1  -  0.95) /1000  =  0.007;  thus,  we  only  report  two  decimal  places  of  precision 
in  Tables  5.2  and  5.3. 


Figure  5.4:  Scatterplots  of  simulated  data.  Left:  Data  from  a  balanced  random 
intercept  and  slope  model.  Right :  Data  from  a  balanced  random  intercept  and  slope 
model  with  a  fixed  quadratic  term.  The  mean  response  is  shown  as  a  solid  black  curve 
and  the  dashed  lines  indicate  the  positions  of  the  true  unknowns  (x0,  3^o)  • 


As  expected,  all  four  procedures  perform  well  for  the  balanced  random  intercept 
and  slope  model,  with  coverage  estimates  close  to  0.95  (see  Table  5.2).  For  the  model 
with  the  quadratic  term,  the  methods  still  performed  well,  however,  the  Wald-based 
interval  performed  much  worse  for  3^o  —  2,  with  coverage  probability  well  below  0.95 
(see  Table  5.3).  A  decrease  in  coverage  was  anticipated  since  we  do  not  expect  the 
sampling  distribution  of  the  inverse  estimator,  xq,  to  be  approximately  normal  (or 
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Table  5.2:  Coverage  probability  and  length  estimates  of  calibration  intervals 
corresponding  to  various  values  of  33)  hi  a  balanced  random  intercept  and  slope 
model.  The  Monte  Carlo  sample  size  was  N  =  1,000  and  the  bootstrap  sample 
size  was  R  =  999. 


33)  = 

0 

0.5 

1 

1.5 

2 

Coverage 

Wald 

0.95 

0.94 

0.93 

0.94 

0.93 

Inversion 

0.95 

0.95 

0.94 

0.93 

0.95 

Percentile  bootstrap 

0.95 

0.94 

0.93 

0.93 

0.94 

Adjusted  inversion 

0.96 

0.94 

0.94 

0.94 

0.96 

Length 

Wald 

0.09 

0.14 

0.24 

0.35 

0.45 

Inversion 

0.09 

0.14 

0.24 

0.34 

0.45 

Percentile  bootstrap 

0.09 

0.14 

0.24 

0.34 

0.45 

Adjusted  inversion 

0.09 

0.14 

0.24 

0.36 

0.47 

Table  5.3:  Coverage  probability  and  length  estimates  of  95%  calibration  intervals 
corresponding  to  various  values  of  33)  hi  a  balanced  random  intercept  and  slope  model 
containing  a  quadratic  term.  The  Monte  Carlo  sample  size  was  N  =  1,000  and  the 
bootstrap  sample  size  was  R  =  999. 


33)  = 

0 

0.5 

1 

1.5 

2 

Coverage 

Wald 

0.95 

0.96 

0.94 

0.94 

0.87 

Inversion 

0.96 

0.95 

0.94 

0.93 

0.93 

Percentile  bootstrap 

0.94 

0.93 

0.93 

0.93 

0.92 

Adjusted  inversion 

0.94 

0.94 

0.94 

0.93 

0.92 

Length 

Wald 

0.04 

0.08 

0.17 

0.35 

1.09 

Inversion 

0.04 

0.08 

0.16 

0.38 

1.36 

Percentile  bootstrap 

0.04 

0.08 

0.16 

0.37 

0.63 

Adjusted  inversion 

0.04 

0.08 

0.17 

0.37 

1.37 
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even  symmetric)  in  this  portion  of  the  curve.  Surprisingly,  the  inversion  interval  seems 
to  perform  as  well  as  the  intervals  based  on  the  parametric  bootstrap  with  respect  to 
coverage.  However,  the  interval  based  on  the  percentile  bootstrap  has  the  advantage 
of  providing  an  estimate  of  the  entire  sampling  distribution  of  Xq,  rather  than  just  a 
conhdene  interval,  as  well  as  good  width.  Also,  the  Wald-based  and  inversion  intervals 
do  not  account  for  the  uncertainty  in  the  estimated  variance  components  and  rely  on 
large-sample  sizes,  therefore,  these  intervals  would  not  be  expected  to  perform  as  well 
on  smaller  data  sets.  A  more  extensive  simulation  study  allowing  for  different  sample 
size  configurations  would  be  necessary  to  further  substantiate  these  claims. 

5.8  Bladder  volume  example 

In  this  section,  we  discuss  an  example  involving  a  real  dataset  taken  from  Brown 
(1993)  where  the  author  states: 

“A  series  of  23  women  patients  attending  a  urodynarnic  clinic  were 
recruited  for  the  study.  After  successful  voiding  of  the  bladder,  sterile 
water  was  introduced  in  additions  of  10,  15,  and  then  25  ml  increments 
up  to  a  final  cumulative  total  of  175  ml.  At  each  volume  a  measure  of 
height  (H)  in  mm  and  depth  (D)  in  mm  of  largest  ultrasound  bladder 
images  were  taken.  The  product  H  x  D  was  taken  as  a  measure  of  liquid 
volume...” 

The  product  of  H  and  D  is  plotted  against  true  volume  in  Figure  5.5  for  each  of  the  23 
subjects.  As  can  be  seen  from  this  plot,  each  subject  has  a  slightly  nonlinear  trend 
and  there  are  some  missing  observations  (i.e.,  the  data  are  unbalanced). 
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Figure  5.5:  Scatterplot  of  the  bladder  volume  data. 


Finding  an  appropriate  random  effects  structure  for  the  model  can  be  difficult. 
An  informal,  but  simple,  approach  is  to  fit  the  same  model  to  the  data  for  each  subject 
and  compare  the  estimated  coefficients.  To  this  end,  we  fit  a  simple  quadratic  model, 
fj,(x)  =  (3 o  +  /3 ix  +  j32x 2,  to  each  of  the  23  subjects  measurements.  Plots  of  the 
individual  95%  confidence  intervals  are  displayed  in  Figure  5.6.  We  should  point  out 
that  we  used  orthogonal  polynomials  to  obtain  each  fit;  this  was  done  to  remove 
the  correlation  between  the  estimated  coefficients.  Clearly,  the  intercept  and  slope 
parameters,  /30  and  f3i,  vary  greatly  between  subjects,  while  (32  remains  relatively 
constant  throughout.  This  suggests  an  LMM  with  random  effects  for  the  intercept 
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and  linear  terms.  In  particular,  we  used  the  following  random  coefficient  model: 

(5.13)  HDy  =  A)  +  a0 i  +  (/?i  +  au)  volume^-  +  ^volume?-. 

Notice  this  is  just  the  linear  random  trend  model  (Equation  (5.4))  with  an  additional 
quadratic  term.  Figure  5.7  shows  the  subject-specific  fits  based  on  model  (5.13). 
Clearly,  the  model  does  a  good  job  describing  the  data.  Figure  5.7  also  shows  how 
each  subject  deviates  from  the  overall  fitted  mean  response  j2(x)  =  /3q  +  (3\x  +  fox2. 


Figure  5.6:  95%  confidence  intervals  for  regression  coefficients  from  subject-specific 
fits  to  the  bladder  volume  data.  Note  that  we  used  orthogonal  polynomials  in  order 
to  remove  the  correlation  between  the  estimated  coefficients. 
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Figure  5.7:  Scatterplot  of  the  bladder  volume  data  with  fitted  mean  response.  Left: 
Scatterplot  of  the  bladder  volume  data  with  fitted  mean  response  (solid  black  curve). 
The  horizontal  red  arrows  indicate  the  positions  of  the  unknown  HD0  =  70  mm2  and 
the  point  estimate  volumeo  obtained  from  inverting  the  fitted  mean  response.  Right : 
Fitted  curves  from  a  quadratic  model  fit  to  the  bladder  volume  data.  The  model 
includes  (uncorrelated)  random  effects  for  the  intercept  and  linear  term  for  each 
subject. 


Suppose  we  obtain  a  new  observation,  HD0  =  70  mm2,  for  which  the  true  volume 
is  unknown.  To  estimate  the  unknown  volume,  denoted  volumeo,  we  proceed  as 
discussed  at  the  end  of  Section  5.2.  In  particular,  we  solve  the  equation 

fi  (volumeo)  =  A)  +  /5i volumeo  +  volume^  =  70  mm2 

for  volume0  using  the  quadratic  formula.  The  point  estimate  obtained  is  volume0  = 
93.4124  ml  (see  Figure  5.7).  Since  volumeo  is  not  the  ML  estimate,  we  cannot  compute 
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an  approximate  ML  interval  as  we  could  for  the  balanced  random  intercept  example. 
However,  the  following  snippet  of  R  code  shows  how  to  use  the  well-known  car  package 
(Fox  and  Weisberg,  2011)  to  compute  the  approximate  standard  error  based  on  the 
first-order  Taylor  series  estimate  given  in  Equation  (5.8).  Note  that  our  model  has 
Var  {(To}  =  ctq  +Xq of  +  of  where  and  a\  are  the  variances  of  the  random  intercept 
and  linear  terms,  respectively.  If  the  random  effects  were  correlated,  there  would  be 
an  additional  term  X0O01  where  Uoi  =  Cov  {«oi,  can}- 

R  Example  5.8.1. 

##  Obtain  fitted  model 

mod  <-  lme(HD  ~  volume  +  I (volume~2) ,  random  =  pdDiag ("volume) , 
data  =  Bladder) 

b  <-  as. numeric (fixef (mod))  #  vector  of  fixed  effects 
##  Set  up  variance-covariance  matrix  from  Equation  (5.7) 

var.yO  <-  getVarCov(mod) [1 ,  1]  +  x0.est~2  *  getVarCov(mod) [2 ,  2]  + 
summary (mod) $sigma~2 
covmat  <-  diag(4) 
covmat[l:3,  1:3]  <-  vcov(mod) 
covmat [4 ,  4]  <-  var . yO 

##  Call  the  deltaMethodO  function  from  package  car 
dm  <-  car :  :  :  deltaMethod(c  (bO  =  b[l],  bl  =  b  [2]  ,  b2  =  b  [3]  ,  yO  =  70), 
g  =  " (-bl+sqrt(bl~2-4*b2*(b0-y0)))/(2*b2) " ,  vcov.  =  covmat) 
rownames(dm)  <-  "" 
dm 

##  Estimate  SE 
##  93.41  19.63 


The  resulting  standard  error  is  19.63  which  yields  an  approximate  (Wald-based) 
95%  confidence  interval  for  volumeo  of  (54.94, 131.89).  For  comparison,  we  computed 
the  same  interval  assuming  the  data  are  cross-sectional,  that  is,  by  ignoring  the 
grouped  structure  of  the  data  and  using  the  methods  discussed  in  Section  3.3.2. 
The  resulting  interval  is  (56.43, 130.77)  which,  as  expected,  is  narrower  than  the  one 
previously  obtained. 
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Obtaining  the  inversion  interval  (Equation  (5.10))  is  less  straightforward;  we  have 
to  write  our  own  prediction  function  in  R  that  will  also  return  the  standard  errors 
of  the  fitted  values.  Example  5.8.2  shows  the  minimal  R  code  necessary  to  obtain 
Jc ai(x)  for  the  bladder  volume  example.  The  first  line  simply  extracts  the  point 
estimate,  volumeo,  obtained  in  Example  5.8.1.  The  next  few  lines  of  code  define  a 
new  prediction  function,  predFun,  that  simply  calls  the  built-in  prediction  function, 
but  additionally  returns  the  standard  errors  of  the  fitted  values.  The  last  block  of 
code  finds  the  roots  to  the  equation  Q2  —  z\_a/2  =  0  where  Q  is  the  approximate 
pivot  described  in  Section  5.4.  The  resulting  interval,  (58.24, 137.05),  is  only  slightly 
larger  than  the  interval  based  on  the  delta  method.  Notice  it  is  also  not  symmetric 
about  the  point  estimate  volume0  =  93.41  ml  which,  given  the  nonlinear  trend  and 
increasing  variation  in  the  data,  is  more  realistic. 

R  Example  5.8.2. 

##  Extract  point  estimate  from  previous  example 
xO.est  <-  dm [ ["Estimate"] ] 

##  Function  to  compute  fitted  values  and  their  standard  errors 
predFun  <-  function (x)  { 
z  <-  list (volume  =  x) 

fit  <-  predict (mod,  newdata  =  z,  level  =  0) 
se.fit  <-  sqrt (diag(cbind(l ,  unlist (z) ,  unlist (z)~2)  %*0/0 
mod$varFix  t(cbind(l,  unlist (z),  unlist (z) ~2) )) ) 
list(fit  =  fit,  se.fit  =  se.fit) 

} 

##  Invert  approximate  prediction  bounds  (numerically) 
invBounds  <-  function (x)  { 
z  <-  list (volume  =  x) 

(70  -  predFun(x)$f it) ~2/(var .y0  +  predFun(x)$se.fit~2)  - 
qnorm(0 . 975) ~2 

} 

c (uniroot (invBounds,  interval  =  c(10,  xO.est),  tol  =  le-10)$root, 
uniroot (invBounds ,  interval  =  c(x0.est,  175),  tol  =  le-10)$root) 

##  [1]  58.24  137.05 
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In  addition,  we  can  also  obtain  the  bootstrap  adjusted  inversion  interval 
(Equation  (5.11))  described  at  the  end  of  Section  5.5.  Figure  5.8  shows  a  histogram 
of  the  R  =  9,  999  bootstrap  replicates  of  W.  As  can  be  seen,  the  estimated  sampling 
distribution  of  W  is  reasonably  normal.  The  necessary  quantiles  to  compute  J*A\{x) 
are  <2q  025  =  —  2  and  <2o.975  =  1-98  (compare  these  to  the  corresponding  quantiles  from 
a  standard  normal  distribution;  i.e.,  ±1.96)).  Making  the  necessary  adjustments  to 
the  code  in  Example  5.8.2  yields  J*A\{x)  =  (57.94, 138.16),  which  is  only  slightly  wider 
than  the  unadjusted  inversion  interval  based  on  the  normal  approximation.  In  short, 
the  normal  approximation  works  quite  well  for  these  data. 
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Figure  5.8:  Bootstrap  distribution  of  W  for  the  bladder  volume  example.  A  standard 
normal  distribution  (solid  black  curve)  is  shown  for  comparison. 
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Finally,  we  compute  the  bootstrap  distribution  of  volume0  directly  using 
Algorithm  2.  Our  results  are  based  on  a  bootstrap  simulation  of  size  R  =  9, 999.  Since 
we  do  not  expect  the  sampling  distribution  of  volumeo  to  be  symmetric,  we  provide 
only  a  95%  percentile  bootstrap  interval,  that  is,  the  interval  obtained  by  taking  the 
lower  2.5  and  upper  97.5  percentiles  of  the  bootstrap  distribution.  The  resulting 
interval,  (58.37,136.02),  is  quite  close  to  the  inversion  interval  previously  obtained. 
A  histogram  of  the  9,  999  bootstrap  replicates  of  volumeo  is  given  in  Figure  5.9.  As 
indicated  by  the  normal  Q-Q  plot  in  Figure  5.9,  the  bootstrap  distribution  is  positively 
skewed  (the  estimated  skewness  is  0.3011),  providing  further  support  for  the  inversion 
and  percentile  intervals  over  the  symmetric  Wald-based  interval  obtained  earlier  using 
the  delta  method. 


Figure  5.9:  Bootstrap  distribution  of  volume0  obtained  using  Algorithm  2.  Left: 
Bootstrap  distribution  of  volume0  obtained  using  Algorithm  2.  The  solid  black  line 
indicates  the  original  point  estimate  and  the  dotted  red  lines  indicate  the  positions  of 
the  sample  2.5  and  97.5  quantiles  of  the  bootstrap  distribution.  Right:  Normal  Q-Q 
plot  of  the  bootstrap  replicates  of  volumeo- 
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5.9  Discussion 


We  have  described  a  number  of  techniques  for  controlled  calibration  in  a  (linear) 
mixed  model  setting.  The  Wald-based  interval  is  the  simplest,  but  relies  on  the 
asymptotic  normality  of  x0  along  with  a  Taylor-series  approximation  of  its  variance. 
Perhaps  the  biggest  drawback  to  using  a  Wald-based  calibration  interval  is  that 
it  is  always  symmetric  about  Xq.  While  this  is  appealing  to  many  researchers,  it 
is  not  very  realistic  in  standard  situations  where,  say,  the  data  exhibit  nonlinear 
behavior  (possibly  due  to  horizontal  asymptotes)  and  nonconstant  variance.  The 
asymptotic  normality  of  x0  may  also  be  questioned  when  x0  is  not  the  ML  estimate 
(as  in  the  bladder  volume  example).  This  is  akin  to  using  the  Wald-based  method 
for  nonlinear  calibration  problems  (the  software  JMP  does  this).  Nonetheless,  the 
estimated  sampling  distribution  of  x0  displayed  in  Figures  5.3  and  5.9  using  the 
parametric  bootstrap  are  both  reasonably  normal.  Thus,  the  Wald-based  approach 
may  still  produce  reliable  inference  when  the  distribution  of  x0  can  be  considered 
symmetric.  Although  more  difficult  to  obtain,  the  inversion  and  parametric  bootstrap 
intervals  are  presumably  superior  to  the  Wald-based  interval. 

All  the  methods  we  have  proposed  rely  on  certain  assumptions,  for  example, 
normality  and  large  sample  size.  These  assumptions  may  or  may  not  hold  in  practice. 
If  the  random  effects  and  errors  are  not  normally  distributed,  then  it  is  possible  to  fit  a 
non-Gaussian  mixed  model  (Jiang,  2007,  p.  8).  In  this  situation,  we  can  still  calculate 
a  calibration  interval  by  inverting  a  corresponding  distribution-free  prediction  interval 
with  asymptotic  coverage  probability  1  —  a.  As  we  have  shown  in  Section  5.6,  this 
is  rather  simple  to  do  for  standard  LMMs.  If,  on  the  other  hand,  the  data  are  not 
normal  and  the  sample  size  is  rather  small,  then  there  is  not  much  we  can  do  with 
the  methods  discussed  in  this  chapter. 
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Future  work  on  data  like  these  might  combine  the  nonparametric  method  of 
calibration  proposed  in  Chapter  4  with  the  application  to  grouped  data  discussed  in 
this  chapter.  In  particular,  we  would  recognize  the  grouped  structure  of  the  data 
but  allow  each  specimen  to  have  its  own  (nonlinear)  trend  by  introducing  subject- 
specific  spline  terms  which  may  or  may  not  have  corresponding  random  effects.  This 
may  seem  far-fetched  at  first,  but  remember  that  the  specific  approach  we  took  for 
nonparametric  calibration  is  already  based  on  an  LMM  (4.2).  This  is  an  obvious,  and 
likely  promising,  area  of  future  research,  and  we  discuss  it,  along  with  some  other 
ideas,  in  the  conclusion  to  this  dissertation. 
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VI.  Conclusions  and  Suggestions  for  Further  Research 


Statistical  calibration  is  an  important  application  of  regression  in  many  areas  of 
science,  for  example:  bioassays,  chemometrics,  and  calibrating  laboratory  equipment. 
For  many  of  these  applications,  the  data  are  inherently  nonlinear  with  no  known 
parametric  form,  or  sometimes  the  data  are  collected  in  such  a  way  that  the 
observations  can  not  be  considered  as  independent.  It  is  necessary,  then,  to  have 
simple  and  general  methods  available  for  calibration  in  these  situations.  This  has 
been  the  main  goal  of  our  research. 

6.1  Conclusions 

We  discussed  (controlled)  semiparametric  calibration  in  Chapter  4.  We  provided 
a  frequentist  approach  to  obtaining  calibration  intervals  that  involved  inverting  bias- 
corrected  prediction  intervals  based  on  the  simple  LMM-based  smoother  described  in 
Ruppert  and  Wand  (2003).  The  coverage  probability  and  length  of  these  intervals 
were  investigated  using  a  small  Monte  Carlo  experiment.  This  experiment  showed 
that  these  intervals  do  in  fact  obtain  coverage  probability  close  to  the  nominal  1  —  a 
level  without  sacrificing  length.  The  experiment  also  highlighted  that  correcting  for 
bias  is  more  serious  for  calibration  with  respect  to  a  mean  response  (i.e. ,  regulation). 
A  simple  Bayesian  analog  was  also  proposed  that  has  the  benefit  of  providing  the 
entire  posterior  distribution  of  xq.  We  illustrated  these  methods  using  real  data 
analysis  examples. 

In  Chapter  5,  we  extended  the  usual  methods  of  (controlled)  calibration  (i.e., 
point  estimation  and  obtaining  Wald-based/inversion  intervals)  for  grouped  data 
using  the  LMM.  We  also  proposed  a  parametric  bootstrap  algorithm  for  controlled 
calibration  in  the  LMM.  This  algorithm  can  be  used  to  obtain  confidence  intervals 
for  the  unknown  Xq  directly  from  the  estimated  sampling  distribution  of  the  point 
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estimator  x0>  or  by  improving  the  inversion  interval  by  removing  the  normality 
constraint  on  the  approximate  predictive  pivot  W.  These  strategies  were  illustrated 
using  real  data  analysis  examples.  We  also  briefly  described  how  to  use  a  distribution- 
free  prediction  interval  to  obtain  a  distribution-free  inversion  interval  for  the  unknown 
Xo  in  closed  form  for  the  random  intercept  model. 

Calibration  will  always  remain  an  important  topic  in  statistics.  We  list  here  some 
possible  topics  for  future  research  based  on  extending  the  ideas  in  Chapters  4  and  5: 

•  Semiparametric  calibration  with  constraints; 

•  Prior  selection  for  Xo  in  semiparametric  calibration; 

•  Bootstrap  for  (controlled)  semiparametric  calibration; 

•  Calibration  in  NLMMs; 

•  Semiparametric  calibration  with  random  coefficients. 

For  the  most  part,  these  topics  were  discussed  in  the  conclusions  to  Chapters  4  and 

5. 
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Appendix  A:  Proofs 


In  this  appendix,  we  provide  the  ”  extra  steps”  for  deriving  some  of  the 
mathematical  results  presented  in  this  dissertation. 

A.l  Conditional  posterior  of  ((3,  a.) 

Note  that  the  kernel  of  a  multivariate  normal  distribution  with  mean  vector  p 
and  variance-covariance  matrix  X  is 

K  (cc;  p,  X)  oc  exp  |  —  -  (x  —  y)'  X-1  ( x  —  y) 

Furthermore,  let  x  and  6  be  n  x  1  vectors,  and  A  be  an  invertible  n  x  n  symmetric 
matrix.  We  can  complete  the  square  for  the  quadratic  form  x' Ax  —  Q'x  by  writing 

x! Ax  —  6'x  =  (x  —  y)'  A (x  —  y)  +  C, 


where 


y  =  -A  16  and  C  =  — O'  A  16. 
2  4 


Let  the  vectors  x0  and  z0  have  the  same  form  as  the  i-th  rows  of  X  and  Z  in 
Equation  (4.2),  respectively,  but  with  Xi  replaced  with  xq.  Ignoring  the  constant  of 
proportionality,  the  conditional  posterior  of  (/3,  ck)  is 


7T  ((3,a\y,  y0 ,  a2,  a2a ,  xO)  oc  tt  (y\(3,  a,  cr2,  a2a ,  x0)  vr  (y0\(3,  ct,  a2,  o\,  x0)  vr  (/3)  it  (a\a2a) 


oc  exp 

=  exp 

=  exp 

=  exp 


2(Te 

1 

’2czf 

1 

2of 

1 

'2^2 


\y  —  X/3  —  Zct\ 


20-2 


a 


2^2  [2/0  -  Kxo)Y 


2  ,  ^6 


I y  -  X(3  -  Zol II'-  +  [y0  -  n(x0)]  + 


a 


(To 


I y  -  xp  -  Za\\ 2  +  (yo  -  x'0P  -  z'0a f  +  -§-||a 


oy 


1 2/o  —  Aq/3  —  Z0«||2  H — y||« 


o-- 
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where 


y  x  z 

Vo  —  i  X0  —  ,  Z0  — 


are  augmented  data  vectors  and  matrices.  To  show  that  the  conditional  posterior  of 
6  =  ((3',  a')'  is  normal,  note  that 


exp  \  I  \\yQ-  X0f3  -  Z0a  ||2  +  -^r  1 1 « "  “ 


=  expl-  —  \\y*-~X*f3-Z*a\\2 


=  exp  -—2u  -  sr0e\\2 


where,  similar  to  before, 


?  Ao  — 


Z*  _ 

0  — 


{v2J°D  i 


are  augmented  data  vectors  and  matrices  and  fig  =  (Xq\Zq).  Now,  using  basic 
matrix  multiplication, 

exp  _  =  exp  {~^  ~  2v«n«e  + 


« exp  rt  -  2y’a'(l’„e) 


Upon  completing  the  square,  we  get 


exp  j-^  {0'to*0'to*0G  -  2 2/o'^)}  =  exp  {0  -  fig)’ ^  ( G 


where 


r2  \  -i 


Ho  =  (Og'Og)  to*0'y*0  =  to0'to0  +  D  too  yQ,  D  =  diag  {0pxp,  Iqxq}  , 


=  u2  (to*0'to*0)  1  =  a2  (flo'too)-1 . 

Thus,  the  conditional  posterior  of  the  coefficients  0  is  multivariate  normal  with  mean 


vector  fig  and  variance-covariance  matrix  Eg. 
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A. 2  Conditional  posteriors  of  a2  and  of 

Note  that  since  a2  ~  IQ  (a,  6),  then 


vr(ae2)  = 


r(a) 


(®f) 


2\  -(“+!) 


exp 


< 7 1 


Now,  ignoring  the  constant  of  proportionality,  the  conditional  posterior  of  of  is 


*{<?,  e\y,  Vo,  /3,«,  <?l,  Xq)  oc  ir(y\(3,  «,  a2)7r(y0\(3,  «,  a2,  x0)n(a2) 


(  2\ -(“+!) 
oc  (ae  j  exp 

(  20  — (ot+1) 

=  K)  exp 


\y  -  Xf3  -  Zol\\2  (yQ  -  x0f3  -  z0a)2  b 


2 of  2a? 

(§11 2/o  —  X0{3  —  Zqol\\2  +  6) 


<x: 


at 


which  is  proportional  to  the  density  function  of  a  IQ  (a,  |||t/o  —  A0/3  —  1 1 J  +  b) 

distribution.  The  proof  for  the  conditional  posterior  7r(a2\y,  r/0,  (3,  ck,  a2,  x0)  follows 
in  an  analogous  manner. 


A. 3  LMM  log-likelihood 

For  the  LMM  (Equation  (5.1)),  we  have  that 

y  ~  J\f(X{3,  V) , 


hence,  the  density  function  is 

fly)  =  (2JT)-"'2 1 VT 1/2  exp  jf  0>  -  X/3 )' V"1  (y  -X/3)  Q 

Taking  the  logarithm  of  f(y)  gives  the  log-likelihood 

t  =  -\  iog(27T)  -  \  log  (I  V\)  - 1  (y  -  x/3)'  V-'  (y  -  X0) . 

Let  V  have  the  block  diagonal  form  V  =  a2  <  Ini  +  ZiG*Z'i  ^  .  Since  the 

l  diag  )  i=  1 

determinant  of  a  block  diagonal  matrix  is  just  the  product  of  the  determinant  of 
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the  block  diagonals,  then 


log  (I V\)  =  log  m  |<7?  (In,  +  ZiG'Z'j 
u=l 

{m 

II  0T‘  in,  +  z,g-z[  i 
2=1 

m  m 

=  V  n,  log  (ae2)  +  V  log  (I  J„,  +  Z.G'Z'S 


1=1 


1=1 


=  N  log  (<r2)  +  V  log  (I In,  +  Z,G‘Z\ |) . 

i=l 


Similarly,  since  the  inverse  of  a  block  diagonal  matrix  is  another  block  diagonal  matrix, 
composed  of  the  inverse  of  each  block,  it  follows  that 


(y  -  xp)'  V-1  {y  -xp)  =  YJ  Wi  -  XiP)'  V-1  (yt  -  xtp) , 

i=l 


where  V,  =  of  (IHi  +  ZiG*Z'i ).  Therefore,  the  log-likelihood  becomes 


t  =  -f  iog(27r)  -  \  log  (I  V|)  -  i  (y  -  xgy  v-'(y-  xp) 

N  Af  1 

=  — 2  log(27r)  -  y  log(^2)  -  2  51  lo§  +  Z*G*Zil) 

^  m 

~2P-Y.  0>i  -  x.0)’  (I~,  +  ZiG-z- :)-■  (y,  -  Xi/3) . 
^  1  =  1 
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Appendix  B:  The  R  Package  investr 


In  this  appendix,  we  describe  the  investr  package  for  the  R  statistical  software. 
The  name  investr  stands  for  inverse  estimation  in  R.  It  is  currently  listed  on  CRAN 
at  http://cran.r-project.org/web/packages/investr/index.html.  The  source  code  for 
additional  functions  (e.g.,  pspline)  can  be  obtained  from  the  GitHub  development 
site  at  https://github.com/wl08bmg/Research/tree/master/Rcode.  To  install  the 
package,  simply  open  an  R  terminal  and  type: 

install .packages ("investr" ,  dependencies  =  TRUE)  #  install  package 
library (investr)  #  load  package 

Once  the  package  is  loaded,  we  have  access  to  all  of  its  functions,  datasets,  and 
examples.  The  three  main  functions  are  described  in  Table  B. 


Function 

Description 

calibrate 

For  a  vector  of  m  response  values  with  unknown 

predictor  value  x0,  computes  the  classical  estimate  (i.e., 

ML  estimate)  Xq  and  a  corresponding  Wald  or  inversion 

interval  for  the  simple  linear  calibration  problem. 

invest 

For  a  vector  of  m  response  values  with  unknown 

predictor  value  xo,  computes  the  classical  estimate  xq 

and  a  corresponding  Wald  or  inversion  interval  for 

polynomial  and  nonlinear  calibration  problems. 

plotFit 

For  plotting  fitted  regression  models  with  or  without 

confidence/prediction  bands. 

Table  B.l:  Main  functions  from  the  investr  package. 
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B.l  The  plotFit  function 

The  plotFit  function  is  a  general  purpose  function  that  is  also  useful  outside  of 
statistical  calibration  problems.  Its  sole  purpose  is  to  plot  fitted  models  for  R  objects 
of  class  lm  or  class  nls  with  the  option  of  adding  a  confidence  and/or  prediction 
band.  For  example,  the  following  snippet  of  R  code  fits  a  simple  linear  regression 
model  to  the  crystal  data  frame  and  plots  the  data  along  with  the  fitted  regression 
line  and  (pointwise)  95%  confidence  band.  Of  course,  we  can  change  the  default 
95%  confidence  level  by  specifying,  for  example,  level=0.9.  Additionally,  we  can 
also  specify  an  adjustment  for  simultaneous  inference  such  as  Scheffe,  Bonferroni,  or 
Working-Hotelling.  The  second  call  to  plotFit  in  the  code  below  illustrates  the  use 
of  both  of  these  options. 


par(mfrow  =  c(l,  2),  cex.main  =  0.8)  #  side-by-side  plots 

crystal. lm  <-  lm(weight  ~  time,  data  =  crystal)  #  fit  model 
plotFit (crystal . lm,  interval  =  "confidence",  shade  =  T, 

col.conf  =  "skyblue",  main  =  "95%  pointwise  confidence  band") 
plotFit (crystal . lm,  interval  =  "confidence",  shade  =  T, 
col.conf  =  "skyblue",  level  =  0.9,  adjust  =  "W-H" , 
main  =  "90%  Working-Hotelling  band") 


95%  pointwise  confidence  band 


90%  Working-Hotelling  band 
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More  elaborate  models  can  also  be  plotted  in  the  same  way.  For  example,  the  following 
snippet  of  code  fits  a  simple  linear,  quadratic,  cubic,  and  natural  cubic  spline  model 
to  the  well-known  cars  data  frame  and  then  plots  the  corresponding  fits  with  both 
confidence  and  prediction  bands  at  the  95%  level. 


data(cars,  package  =  "datasets")  #  load  cars  data  frame 
library (splines)  #  load  splines  package 
##  Fit  models 

cars. lml  <-  lm(dist  ~  poly (speed,  degree  =  3),  data  =  cars) 
cars.lm2  <-  lm(dist  ~  ns(speed,  df  =  3) ,  data  =  cars) 

##  Plot  models 

par(mfrow  =  c(l,  2))  #  2-by-2  grid  of  plots 

plotFit (cars . lml ,  interval  =  "both",  xlim  =  c(-10,  40), 
ylim  =  c(-50,  150),  main  =  "Cubic  polynomial") 
plotFit (cars . Im2,  interval  =  "both",  xlim  =  c(-10,  40), 
ylim  =  c(-50,  150),  main  =  "Natural  cubic  spline") 


Cubic  polynomial 


Natural  cubic  spline 
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B.2  The  calibrate  function 


The  most  basic  calibration  problem,  the  one  often  encountered  in  more  advanced 
regression  texts,  is  the  simple  linear  calibration  problem  for  which 

yi  =  A)  +  PiXi  +  Q,  e*  ~  J\f  (0,  oy )  ,  i  =  l, 

yoj  =  Po  +  fhxo  +  eoj,  €oj  ~  A/"  (0,  of)  ,  j  =  l,...,m. 

For  example,  consider  the  arsenic  data  introduced  in  Section  3.2.4.  The  following 
snippet  of  code  obtains  a  95%  inversion  interval  and  95%  Wald-based  interval  for 

the  unknown  x0  corresponding  to  y0  =  3  based  on  Equations  (3.12)  and  (3.18), 

respectively: 


calibrate (arsenic . lm,  yO  =  3, 

interval  =  "inversion") 

##  estimate 

lower  upper 

##  2.931 

2.537  3.325 

calibrate (arsenic . lm,  yO  =  3, 

interval  =  "Wald") 

##  estimate 

lower  upper 

se 

##  2.9314 

2.5374  3.3255 

0.1929 

If  instead  we  were  interested  in  the  unknown  Xo  corresponding  to  a  fixed  mean 
response  of  hq  —  3  (i.e.,  a  regulation  problem)  we  would  instead  use 

calibrate (arsenic . lm,  yO  =  3,  interval  =  "inversion", 
mean. response  =  TRUE) 

##  estimate  lower  upper 

##  2.931  2.860  3.002 

B.3  The  invest  function 

In  this  section,  we  describe  the  more  general  function,  invest,  which  can  be  used 
for  more  complex  univariate  calibration  problems  such  as  polynomial  and  nonlinear 
calibration. 


Ill 


For  the  quadratic  linear  model  in  the  whiskey  age  example  of  Section4.2.1,  we 
used  the  following  code  to  obtain  a  95%  inversion  interval  for  the  unknown  age 
corresponding  to  sample  with  a  known  proof  of  108: 

whiskey  <-  data. frame (age  =  c(0,  0.5,  1,  2,  3,  4,  5, 

6,  7,  8),  proof  =  c (104 . 6 ,  104.1,  104.4,  105,  106, 

106.8,  107.7,  108.7,  110.6,  112.1)) 
whiskey. lm  <-  lm(proof  ~  age  +  I(age~2),  data  =  whiskey) 
invest (whiskey . lm,  yO  =  108) 

##  estimate  lower  upper 

##  5.233  4.678  5.735 


As  for  a  nonlinear  regression  example,  we  consider  the  nasturtium  example  of 
Section  3.4.1.  The  following  snippet  of  code  fits  the  log-logistic  regression  function 


A,  An  AO  =  < 

^  P1/  [1  +  exp  {/32  +  As  ln(z)}] ,  x  >  0 

to  the  data  and  obtains  both  a  95%  inversion  interval  and  95%  Wald-based  interval 
for  the  unknown  concentration  corresponding  to  the  three  unknowns  309,  296,  and 
419: 


nas.nls  <-  nls(weight  ~  ifelse(conc  ==  0,  thetal,  thetal/(l  + 
exp(theta2  +  theta3  *  log(conc) ) ) ) ,  data  =  nasturtium, 
start  =  list (thetal  =  1000,  theta2  =  0,  theta3  =1)) 
invest (nas .nls ,  yO  =  c(309,  296,  419),  interval  =  "inversion") 

##  estimate  lower  upper 
##  2.264  1.772  2.969 


invest (nas. nls,  yO  =  c(309,  296,  419),  interval  =  "Wald") 

##  estimate  lower  upper  se 

##  2.2639  1.6889  2.8388  0.2847 


Bootstrap  approaches  to  obtaining  calibration  intervals  are  currently  not 
available  in  the  investr  package,  however,  a  future  release  is  likely  to  contain  some 
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bootstrap  functionality.  Until  such  time,  the  well-known  boot  package  can  be  used 
to  obtain  the  bootstrap  calibration  intervals  described  in  Chapter  3.  The  bootMer 
function  from  the  R  package  Ime4  (>  1.0  —  5)  can  be  used  to  obtain  the  parametric 
bootstrap  calibration  intervals  discussed  in  Chapter  5. 
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